Frequently asked questions
Last updated 5th July 2023
How does ScribeAI work?
ScribeAI uses Whisper, an open-source AI model from OpenAI. Whisper is a speech-to-text (STT) model. We use three models from the Whisper family: Base, Small, and Medium. Base is the fastest but the least accurate; Medium is the most accurate but the slowest. We run these models on your device using the open-source GGML library, with some heuristics and code adjustments that make them run faster and more efficiently and make the transcriptions more accurate.
Do you collect any data?
We do not collect any user data from the app. All audio processing, transcription, and translation takes place entirely on the device. No data is transferred off the device for processing. Please see our privacy policy for further information.
What is the difference between the Rapid, Balanced and Precise models?
Rapid, Balanced, and Precise are the names of the three AI models that power ScribeAI's transcription and translation features. Each model trades off accuracy, speed, and battery usage differently. "Rapid" is powered by Whisper Base; it is the fastest model and operates in real time on most devices. It performs best in quiet environments with well-enunciated speech. "Balanced" uses Whisper Small; it works exceptionally well in noisy environments and with heavy accents, providing highly accurate transcriptions for the majority of speech types. Lastly, "Precise" uses Whisper Medium and is our most accurate model. It handles most speech types with minimal errors and particularly excels at transcribing technical speech and at on-device translation. In the future, we may replace the underlying AI models with newer ones or with fine-tuned versions of the Whisper models as they become available.
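For readers who think in code, the naming above can be summarised as a small lookup table. This is only an illustrative sketch: the dictionary, the function name, and the lowercase model labels are assumptions made for this example, not ScribeAI's actual code.

```python
# Hypothetical lookup from ScribeAI's model names to the Whisper model
# described in this FAQ. The names and labels here are illustrative only.
MODEL_TIERS = {
    "Rapid": "base",       # fastest, least accurate
    "Balanced": "small",   # middle ground
    "Precise": "medium",   # most accurate, slowest
}


def whisper_model_for(tier: str) -> str:
    """Return the Whisper model label for a ScribeAI model name."""
    return MODEL_TIERS[tier]
```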
What is the difference between Live mode and Post-Speech mode?
Live mode continuously sends audio to the AI model for real-time processing. While this allows for immediate responses, it can be intensive on the device's battery, particularly when using the more detailed models like Balanced and Precise. Conversely, Post-Speech mode helps conserve battery life by waiting until there is a significant pause in the speech before sending the audio for processing. Not only is this lighter on the battery, but it can also result in a more accurate response, as it allows the model to analyze the speech in its entirety, rather than piece by piece.
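The idea of waiting for a significant pause can be sketched roughly as follows. This FAQ does not describe how ScribeAI detects pauses, so everything here is an assumption for illustration: the function name, the per-frame loudness values, the silence threshold, and the number of quiet frames that counts as a pause.

```python
from typing import List, Sequence, Tuple


def split_on_pauses(
    frame_energies: Sequence[float],
    silence_threshold: float = 0.01,  # assumed loudness cut-off
    min_pause_frames: int = 3,        # assumed length of a "significant" pause
) -> List[Tuple[int, int]]:
    """Group audio frames into utterances separated by long pauses.

    Returns (first_frame, last_frame) index pairs, one per utterance.
    Each utterance would be sent to the model as a whole.
    """
    segments: List[Tuple[int, int]] = []
    current: List[int] = []  # indices of speech frames in the open utterance
    silent_run = 0           # consecutive quiet frames seen so far

    for i, energy in enumerate(frame_energies):
        if energy >= silence_threshold:
            current.append(i)
            silent_run = 0
        else:
            silent_run += 1
            # A long enough pause closes the open utterance.
            if current and silent_run >= min_pause_frames:
                segments.append((current[0], current[-1]))
                current = []

    # Flush any speech still open when the audio ends.
    if current:
        segments.append((current[0], current[-1]))
    return segments
```

In this sketch a short gap, shorter than `min_pause_frames`, stays inside the same utterance, which is what lets the model see a whole phrase at once instead of piece by piece.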
Will there be an Android version?
Although we appreciate the interest, we currently do not have plans to develop an Android version of ScribeAI. This decision is primarily due to the vast diversity of Android devices available. Ensuring consistent user experience across all of these devices poses a significant challenge. Furthermore, ScribeAI utilizes several proprietary Apple APIs, which are incompatible with the Android ecosystem. We apologize for any inconvenience this may cause.
Why doesn't the Precise model work on my phone?
The Precise model is very demanding on battery, compute, and RAM. It struggles to run on older phones, particularly if other battery-intensive apps are running at the same time. It may not work at all on some older iPhones, particularly those with 3GB of RAM or less. If it's not working for you, we recommend using the Balanced model instead. This model runs on most devices and maintains a high degree of accuracy for most speech.
Is the app making my phone hot or draining my battery?
In some circumstances, ScribeAI may cause your phone to heat up or use more battery. This tends to happen on older iPhones and when using the Balanced or Precise models. If you encounter this issue, we recommend first switching to Post-Speech mode, as it is significantly less demanding on your battery. If you have already switched to Post-Speech mode and the app still uses a lot of battery, we recommend switching to the Rapid model, which is much smaller and less power-intensive than the Balanced and Precise models.
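The order of these recommendations can be written out as a short decision function. It is a sketch of the advice in this answer only; the function name and the plain-text mode and model strings are hypothetical and do not correspond to any setting or API in the app.

```python
from typing import Optional


def next_battery_step(mode: str, model: str) -> Optional[str]:
    """Suggest the next step from this FAQ for heat or battery drain.

    `mode` is "Live" or "Post-Speech"; `model` is "Rapid", "Balanced",
    or "Precise". Returns None when both suggestions are already applied.
    """
    # First recommendation: Post-Speech mode is lighter on the battery.
    if mode == "Live":
        return "Switch to Post-Speech mode"
    # Second recommendation: fall back to the smallest model.
    if model in ("Balanced", "Precise"):
        return "Switch to the Rapid model"
    return None
```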
Why does it only translate to English?
This is a limitation of the Whisper family of models. In the future we hope to train a new model which can translate between many different languages.
Can I separate the transcription by individual speakers?
Currently, this feature isn't available, but we hope to support it in the future.
How can I contact you?
Feel free to email us at [email protected]! ClinicalAI Ltd is a UK-based company that developed the ScribeAI app.