
Applying the OpenAI Whisper API to Spring Boot

EastShine_ 2023. 8. 24. 23:08

 

What is Whisper?

 
Whisper is an automatic speech recognition model developed by OpenAI, the company currently in the spotlight for ChatGPT. It uses STT (Speech-to-Text) technology to convert speech into text, so a recorded conversation can be saved as text.
On March 1 of this year (2023), OpenAI released the Whisper API alongside the ChatGPT API (gpt-3.5-turbo). Before the API existed, Whisper was inconvenient to use, but now developers can use the high-performance model (Large-v2) for a small fee, which makes it much more convenient. So let's take a quick look at how to call the Whisper API from Spring Boot.
 
Before getting into it, let's briefly cover a few points.
The accuracy of STT is usually judged by WER (Word Error Rate) and CER (Character Error Rate).
The lower the value, the higher the accuracy. The figure below is a graph of per-language WER measured with the Large-v2 model.
 

Korean holds up reasonably well. Its performance is not outstanding, but it is better than average.
Still, the fact that the official image below uses Korean as a representative example gave me a bit more confidence. (I already trust OpenAI a great deal because of ChatGPT.)
 

 
 

Is the Whisper API paid or free?

 
As mentioned briefly above, the Whisper API is not free. You issue an OpenAI API key, and credits are deducted according to usage. For Whisper the price is $0.006 per minute of audio, or roughly 8 won in Korean currency.
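As a rough illustration of that pricing, the sketch below estimates the cost of a recording from its length. The $0.006 per minute rate comes from the text above; the exchange rate of 1,330 won per dollar and the per-second proration are assumptions for the example, and the class name is hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Back-of-the-envelope cost estimate for Whisper API usage. */
final class WhisperCostEstimator {

    // Rate quoted in the post: 0.006 USD per minute of audio.
    static final BigDecimal USD_PER_MINUTE = new BigDecimal("0.006");

    /** Prorates the per-minute rate by the second (assumed billing granularity). */
    static BigDecimal estimateUsd(long audioSeconds) {
        if (audioSeconds < 0) {
            throw new IllegalArgumentException("audioSeconds must not be negative");
        }
        return USD_PER_MINUTE
                .multiply(BigDecimal.valueOf(audioSeconds))
                .divide(BigDecimal.valueOf(60), 6, RoundingMode.HALF_UP);
    }

    /** Converts to won with a caller-supplied exchange rate, rounded to a whole won. */
    static BigDecimal toKrw(BigDecimal usd, BigDecimal krwPerUsd) {
        return usd.multiply(krwPerUsd).setScale(0, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal assumedRate = new BigDecimal("1330"); // assumed KRW per USD
        BigDecimal oneHour = estimateUsd(3600);
        System.out.println("1 hour of audio: " + oneHour + " USD, about "
                + toKrw(oneHour, assumedRate) + " KRW");
    }
}
```

At that rate an hour of audio comes to $0.36, so a handful of short test clips costs very little.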
 

 
Now let's look at how to actually use it.
 
 
 

Applying the Whisper API to Spring Boot

 

1. Issuing an OpenAI API Key

First, to use the Whisper API you need to issue an OpenAI API key.
 
https://platform.openai.com/account/api-keys


Go to the official site link above, log in, and click the 'Create new secret key' button to issue a new key.
 
 

The new key is issued as shown above. One caveat: a key cannot be viewed again once it has been issued, so store it somewhere safe. (If you do lose it, it seems you can simply generate a new one.)
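One common way to keep the key out of source control is to read it from an environment variable rather than hard-coding it. The sketch below is only an illustration of that idea: the variable name OPENAI_API_KEY and the class ApiKeyLoader are assumptions, not something this project defines.

```java
import java.util.Map;

/** Loads an API key from the environment and masks it for safe logging. */
final class ApiKeyLoader {

    // Assumed variable name; use whatever name your deployment defines.
    static final String ENV_NAME = "OPENAI_API_KEY";

    /** Returns the trimmed key, or fails fast when it is missing or blank. */
    static String load(Map<String, String> env) {
        String key = env.get(ENV_NAME);
        if (key == null || key.isBlank()) {
            throw new IllegalStateException(ENV_NAME + " is not set");
        }
        return key.trim();
    }

    /** Replaces everything except the last four characters with asterisks. */
    static String mask(String key) {
        int visible = Math.min(4, key.length());
        return "*".repeat(key.length() - visible) + key.substring(key.length() - visible);
    }

    public static void main(String[] args) {
        try {
            System.out.println("Loaded key " + mask(load(System.getenv())));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In a Spring Boot project the same effect can usually be had without extra code by writing a property placeholder such as `${OPENAI_API_KEY}` as the value in application.yml.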
 
Also, open the Usage tab in the left menu of that page and check that you have credits available. In my case the free credits granted at sign-up had already expired (three months is too short), so I paid $5.
 
 
 

2. Creating the Project

Now let's create the Spring Boot project.
I will only explain the code briefly. The full code is available at the GitHub link at the bottom of this post.
I used Spring Boot 2.7.8 and Java 11, with Spring Cloud OpenFeign so that the Whisper API call can be implemented simply.
Below is the build.gradle file.
 

// build.gradle

plugins {
    id 'java'
    id 'org.springframework.boot' version '2.7.8'
    id 'io.spring.dependency-management' version '1.0.15.RELEASE'
}

group = 'com.eastshine'
version = '0.0.1-SNAPSHOT'

java {
    sourceCompatibility = '11'
}

configurations {
    compileOnly {
        extendsFrom annotationProcessor
    }
}

repositories {
    mavenCentral()
}

ext {
    set('springCloudVersion', "2021.0.3")
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-web'
    compileOnly 'org.projectlombok:lombok'
    developmentOnly 'org.springframework.boot:spring-boot-devtools'
    annotationProcessor 'org.projectlombok:lombok'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    implementation 'org.springframework.cloud:spring-cloud-starter-openfeign'
}

dependencyManagement {

    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
    }

}

tasks.named('test') {
    useJUnitPlatform()
}

 
 
First, write the OpenAI configuration in application.yml as shown below.
Put the key you issued earlier in api-key.
 

# application.yml

openai-service:
  api-key: (enter the key issued by OpenAI)
  gpt-model: gpt-3.5-turbo
  audio-model: whisper-1
  http-client:
    read-timeout: 3000
    connect-timeout: 3000
  urls:
    base-url: https://api.openai.com/v1
    chat-url: /chat/completions
    create-transcription-url: /audio/transcriptions

 
 
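The core of the API call is written as an interface using OpenFeign.

// OpenAIClient.java

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.ModelAttribute;
import org.springframework.web.bind.annotation.PostMapping;

// Declarative HTTP client; the base URL and endpoint path are read from application.yml.
@FeignClient(
        name = "openai-service",
        url = "${openai-service.urls.base-url}",
        configuration = OpenAIClientConfig.class
)
public interface OpenAIClient {

    // Sends the audio file and model name to the transcription endpoint as multipart/form-data.
    @PostMapping(value = "${openai-service.urls.create-transcription-url}", headers = {"Content-Type=multipart/form-data"})
    WhisperTranscriptionResponse createTranscription(@ModelAttribute WhisperTranscriptionRequest whisperTranscriptionRequest);

}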
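For readers curious what this call amounts to on the wire, here is a rough sketch that builds the same kind of multipart/form-data request with only the JDK's java.net.http.HttpClient. It is not the code used in this project. The class name WhisperRawClient, the OPENAI_API_KEY environment variable, the 60-second timeout, and the application/octet-stream part type are all assumptions made for illustration; the endpoint URL and the `model` and `file` fields match the configuration above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.UUID;

/** Calls the transcription endpoint without Feign, to show the raw request shape. */
final class WhisperRawClient {

    static final String ENDPOINT = "https://api.openai.com/v1/audio/transcriptions";

    /** Builds a multipart/form-data body with a "model" text part and a "file" binary part. */
    static byte[] multipartBody(String boundary, String model, String fileName, byte[] audio) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"model\"\r\n\r\n"
                + model + "\r\n"
                + "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n";
        out.writeBytes(head.getBytes(StandardCharsets.UTF_8));
        out.writeBytes(audio);
        // The closing delimiter ends with two extra dashes.
        out.writeBytes(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    /** Sends the audio file and returns the raw JSON response body. */
    static String transcribe(String apiKey, Path audioFile) throws IOException, InterruptedException {
        String boundary = "whisper-" + UUID.randomUUID();
        byte[] body = multipartBody(boundary, "whisper-1",
                audioFile.getFileName().toString(), Files.readAllBytes(audioFile));
        HttpRequest request = HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(Duration.ofSeconds(60)) // assumed value; long audio may need more
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Whisper API returned HTTP " + response.statusCode() + ": " + response.body());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java WhisperRawClient.java <audio-file>");
            return;
        }
        System.out.println(transcribe(System.getenv("OPENAI_API_KEY"), Path.of(args[0])));
    }
}
```

Seen this way, the appeal of OpenFeign is clear: the boundary handling, headers, and response mapping above collapse into a single annotated interface method.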

 
 
The service layer calls the API: it sends the audio file in the request and gets the transcribed text back.

// OpenAIClientService.java

import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;

@RequiredArgsConstructor
@Service
public class OpenAIClientService {

    private final OpenAIClient openAIClient;
    private final OpenAIClientConfig openAIClientConfig;

    // Wraps the uploaded file and the configured model name (whisper-1) into the Whisper request.
    public WhisperTranscriptionResponse createTranscription(TranscriptionRequest transcriptionRequest){
        WhisperTranscriptionRequest whisperTranscriptionRequest = WhisperTranscriptionRequest.builder()
                .model(openAIClientConfig.getAudioModel())
                .file(transcriptionRequest.getFile())
                .build();
        return openAIClient.createTranscription(whisperTranscriptionRequest);
    }

}

 
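The controller layer exposes this as a REST API.

// WhisperController.java

import lombok.RequiredArgsConstructor;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.ModelAttribute;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RequiredArgsConstructor
@RestController
@RequestMapping(value = "/api")
public class WhisperController {

    private final OpenAIClientService openAIClientService;

    // POST /api/transcription: accepts a multipart form containing the audio file.
    @PostMapping(value = "/transcription", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
    public WhisperTranscriptionResponse createTranscription(@ModelAttribute TranscriptionRequest transcriptionRequest){
        return openAIClientService.createTranscription(transcriptionRequest);
    }

}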

 
 
 

3. Audio Sample

Now that the implementation is done, let's see whether it actually works.
I could have recorded something myself, but while searching online for more realistic audio I found that a site called AI Hub provides a huge amount of sample data. The data is free to use after signing up (for non-commercial purposes only, of course).
 
https://www.aihub.or.kr/


 
I was curious about Whisper's performance, so I wanted to feed it low-quality audio rather than clean recordings. I therefore downloaded the dataset below, '저음질 전화망 음성인식 데이터' (low-quality telephone network speech recognition data). This dataset alone is a whopping 236GB, which was more than I could handle, so I downloaded only a small part of it.

 
 
 

4. Sending the Audio File with Postman

The audio files from AI Hub contain the voices of real call-center agents and customers.
I picked one at random, and its content was as follows.

"Ah, then, um, the remaining slots we currently have for consulting reservations are this Friday or next week (stammers), we have slots left during next week. Is there a date that works for you?"

That is what the agent says in the recording. Now it is time to send this file with Postman.
 

 
When I sent the audio file and the model name in the request as shown above, the response came back with the content of the audio as text! Despite the poor audio quality, the transcription seemed quite good (apart from 컨설팅, "consulting", being transcribed as 콘서트, "concert").
 
 

Closing

There are many STT APIs besides Whisper, but since I use ChatGPT regularly and know how powerful it is, I was curious about Whisper's performance too. I only ran it a few times, but it transcribed quite well. If I had the budget I would keep running it and measure the Korean WER myself, but I'll leave that for another time.
 
 
 
 
 
References
 
https://openai.com/research/whisper
https://betterprogramming.pub/integrating-chatgpt-and-whisper-apis-into-spring-boot-microservice-5545e2ea44fc
 
 
 
 
Full code on GitHub
 
https://github.com/eastshine12/whisper-api
