Whisper is a speech recognition system, released under an open-source license, that was trained on 680,000 hours of multilingual, multitask data collected from the Internet. It is designed to be robust to accents, background noise, and technical language, and it supports both transcription in many languages and translation from those languages into English. The approach is a simple end-to-end one, implemented as an encoder-decoder Transformer. The model can also identify the spoken language and produce phrase-level timestamps. It was built to be easy to use and highly accurate, so that developers can add voice interfaces to a wider range of applications.
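As a rough illustration of the transcription, translation, language identification, and phrase-level timestamps described above, the following is a minimal sketch that assumes the open-source `openai-whisper` Python package and `ffmpeg` are installed. The helper names (`format_timestamp`, `segments_to_lines`, `transcribe_file`), the `"base"` model size, and the file name in the usage note are illustrative choices, not part of the original text.

```python
from typing import Iterable, List, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segments_to_lines(segments: Iterable[Mapping]) -> List[str]:
    """Turn phrase-level segments into printable lines.

    Each segment is expected to carry "start", "end" (seconds) and "text",
    which is the shape of the entries in the "segments" list returned by
    Whisper's transcribe().
    """
    return [
        f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe_file(path: str, model_name: str = "base",
                    translate: bool = False) -> dict:
    """Transcribe an audio file, or translate it into English.

    Hypothetical wrapper: the import is deferred so the formatting helpers
    above can be used without the whisper package installed.
    """
    import whisper  # provided by the openai-whisper package

    model = whisper.load_model(model_name)
    task = "translate" if translate else "transcribe"
    # The result dict includes "text", "language" and "segments".
    return model.transcribe(path, task=task)
```

A call such as `result = transcribe_file("interview.mp3")` (a placeholder file name) would be expected to return the full text under `result["text"]` and the detected language code under `result["language"]`, and `segments_to_lines(result["segments"])` would then list each phrase with its start and end time. Passing `translate=True` requests English output instead of a transcript in the source language.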