GPT-2 Output Detector

Why choose our GPT-2 detector
Statistical precision
Uses a RoBERTa-based model to analyze token probability distributions and identify the distinctive "fingerprint" left by GPT-2's sampling methods.
Specialized in legacy models
While the latest detection tools focus on GPT-4, this tool specializes in the 1.5-billion-parameter GPT-2 model and picks up subtle nuances that general-purpose tools tend to miss.
Perplexity scoring
Measures the "randomness" of a text and flags low-perplexity regions that GPT-2 tends to produce and that are statistically unusual in human writing.
Zero-shot analysis
No prior context is required. The tool evaluates raw GPT-2 output across a range of Temperature settings and Top-K/Top-P sampling settings.
Research-grade privacy protection
Designed for researchers and developers. Datasets are encrypted, and submitted strings are never used for training or stored.
Probability heatmap
Visualizes the probability of each word. Tokens that the GPT-2 model predicts with high confidence are highlighted as indicators of AI origin.
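
To make the perplexity scoring and probability heatmap ideas concrete, here is a minimal sketch in Python. It assumes per-token probabilities have already been obtained from a language model; the function names, the window size, and the thresholds are illustrative assumptions, not this tool's actual implementation.

```python
import math
from typing import List, Tuple


def perplexity(token_probs: List[float]) -> float:
    """Perplexity = exp(mean negative log-probability). Probabilities must be in (0, 1]."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must be in (0, 1]")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def flag_low_perplexity(
    token_probs: List[float], window: int = 4, threshold: float = 2.0
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of sliding windows whose perplexity is below threshold.

    The window size and threshold are hypothetical values chosen for illustration.
    """
    flagged = []
    for start in range(0, len(token_probs) - window + 1):
        chunk = token_probs[start:start + window]
        if perplexity(chunk) < threshold:
            flagged.append((start, start + window))
    return flagged


def highlight(tokens: List[str], token_probs: List[float], min_prob: float = 0.8) -> str:
    """Plain-text 'heatmap': wrap tokens predicted with high confidence in [[ ]]."""
    return " ".join(
        f"[[{tok}]]" if p >= min_prob else tok
        for tok, p in zip(tokens, token_probs)
    )


if __name__ == "__main__":
    # Hypothetical probabilities, not real GPT-2 output.
    tokens = ["The", "cat", "sat", "on", "the", "mat"]
    probs = [0.05, 0.1, 0.9, 0.95, 0.9, 0.85]
    print(perplexity(probs))
    print(flag_low_perplexity(probs))
    print(highlight(tokens, probs))  # The cat [[sat]] [[on]] [[the]] [[mat]]
```

A lower perplexity means the model found the tokens more predictable, which is why runs of high-probability tokens are the ones that get flagged and highlighted.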

Forensic analysis specialized for GPT-2
Uses a dedicated classifier trained on the original GPT-2 output dataset. It analyzes the syntax and linguistic markers characteristic of early transformer models to clearly determine whether content is authentic.

Detailed probability breakdown
Provides a comprehensive report that includes "human (Real) vs. AI (Fake)" probability scores. The text is broken down segment by segment to pinpoint where GPT-2 generation patterns are most pronounced.

Support for all GPT-2 variants
Tuned to detect, with high sensitivity, text generated by any GPT-2 model: Small, Medium, Large, and the full-size 1.5B (Extra Large).
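
The segment-by-segment breakdown described above can be sketched roughly as follows. This is an illustration only: the sentence splitter is deliberately naive, and `toy_scorer` is a hypothetical placeholder for a real classifier that returns a "Fake" probability between 0 and 1.

```python
import re
from typing import Callable, Dict, List, Optional


def split_segments(text: str) -> List[str]:
    """Naive split on sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def segment_report(text: str, fake_score: Callable[[str], float]) -> Dict:
    """Score each segment and report where the 'Fake' probability is highest."""
    rows = []
    for segment in split_segments(text):
        fake = fake_score(segment)
        rows.append({"segment": segment, "fake": fake, "real": 1.0 - fake})
    most: Optional[int] = None
    if rows:
        most = max(range(len(rows)), key=lambda i: rows[i]["fake"])
    return {"segments": rows, "most_pronounced": most}


def toy_scorer(segment: str) -> float:
    """Placeholder scorer (assumption): longer segments score higher. Not a real detector."""
    return min(1.0, len(segment.split()) / 10)


if __name__ == "__main__":
    report = segment_report(
        "Short one. This sentence has exactly eight words in it.", toy_scorer
    )
    for row in report["segments"]:
        print(f'{row["fake"]:.2f}  {row["segment"]}')
    print("most pronounced segment index:", report["most_pronounced"])
```

Passing the scorer in as a function keeps the report logic independent of whichever classifier actually produces the probabilities.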

How to verify GPT-2 content

Paste the text
Copy the text you suspect was generated by GPT-2 and paste it into the secure analysis field. .txt files are also supported for batch processing.

Run the statistical scan
Click the "Analyze" button to start the RoBERTa-based classifier. The system compares the token distribution against known GPT-2 output patterns.

Check the score
Review the final percentage. The higher the "Fake" score, the more closely the text follows the predictable statistical path of the GPT-2 language model.
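
This page does not say how the final percentage is computed. A common approach for a two-class classifier, shown here purely as an assumption, is to apply a softmax to the "Real" and "Fake" logits and report the results as percentages. The function name and the logit values are hypothetical.

```python
import math
from typing import Dict


def real_fake_percentages(real_logit: float, fake_logit: float) -> Dict[str, float]:
    """Convert two raw classifier logits into percentages that sum to 100 (softmax)."""
    # Subtract the larger logit first for numerical stability.
    m = max(real_logit, fake_logit)
    e_real = math.exp(real_logit - m)
    e_fake = math.exp(fake_logit - m)
    total = e_real + e_fake
    return {"Real": 100.0 * e_real / total, "Fake": 100.0 * e_fake / total}


if __name__ == "__main__":
    # Hypothetical logits, not output from the actual tool.
    scores = real_fake_percentages(real_logit=-1.2, fake_logit=2.3)
    print(f'Real: {scores["Real"]:.1f}%  Fake: {scores["Fake"]:.1f}%')
```

Under this assumption, the "Fake" percentage depends only on the gap between the two logits, so a larger gap in favor of "Fake" yields a higher score.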

Ideal for technical audits

For AI researchers
Validate datasets and benchmark the "detectability" of synthetic language models against human-written control groups.

For archive verification
Audit older web archives and datasets from 2019 to 2021 to identify GPT-2-generated spam and bot content that made its way in early on.

For NLP developers
Test your own fine-tuned GPT-2 models and check whether their custom output has become indistinguishable from human writing.

For cybersecurity teams
Identify automatically generated "fake news" and social media bot campaigns that abuse GPT-2, which can produce large volumes of text at low cost.

Who is this tool best suited for?

Data scientists
Keep training data clean by removing GPT-2-generated text that could lead to model collapse or degraded quality.

Academic researchers
Study the evolution of AI writing. Use the tool to accurately distinguish human writing from early transformer-based generated text.

Forensic linguists
Apply quantitative methods in legal and investigative cases where a digital document is suspected of being machine-generated.

Content moderators
Detect automated comments and forum posts produced by legacy scripts that still use the GPT-2 architecture for the sake of speed.

Fact-checkers
Quickly judge whether a widely shared "leak" or document is in fact a GPT-2 hallucination (fabrication).

Software engineers
Integrate the API into your workflow to automatically screen low-quality GPT-2-generated text submitted by users.
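
This page gives no API details, so the following is only a sketch of how such screening might be wired into a workflow. The `detect` callable stands in for a real API client, and `stub_detect` and the 0.9 threshold are hypothetical placeholders.

```python
from typing import Callable, Iterable, List, Tuple


def screen_posts(
    posts: Iterable[str],
    detect: Callable[[str], float],
    fake_threshold: float = 0.9,
) -> Tuple[List[str], List[str]]:
    """Split posts into (accepted, flagged) based on a detector's 'Fake' probability."""
    accepted: List[str] = []
    flagged: List[str] = []
    for post in posts:
        if detect(post) >= fake_threshold:
            flagged.append(post)
        else:
            accepted.append(post)
    return accepted, flagged


def stub_detect(text: str) -> float:
    """Stand-in for a real detector call (assumption); returns a fixed score per pattern."""
    return 0.95 if "lorem ipsum" in text.lower() else 0.10


if __name__ == "__main__":
    accepted, flagged = screen_posts(["Hello there", "Lorem ipsum dolor"], stub_detect)
    print("accepted:", accepted)
    print("flagged:", flagged)
```

In practice the flagged list would more likely feed a human review queue than an automatic rejection, since any classifier threshold produces some false positives.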

Frequently asked questions about GPT-2 detection
Have a technical question about identifying GPT-2 output? Our engineering team answers in detail.

It may catch some patterns, but this tool is optimized for GPT-2. For newer models, we recommend our "Universal AI Detector", which is built to handle RLHF-tuned output.

The score is based on how probable the word sequence is under the GPT-2 model. A "Fake" score of 99% means the classifier judges the text to be highly consistent with GPT-2's statistical output.

Yes. Even when a model has been trained on domain-specific data such as medical or legal text, the underlying transformer architecture leaves detectable statistical traces.

This is because short texts of fewer than 10 words offer too few data points for statistical analysis, so results tend to vary. For maximum accuracy, we recommend analyzing texts of 50 words or more.
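
As an illustration of the length guidance above, a caller could check the word count before submitting text. The 10-word and 50-word figures come from this page; the function name and return labels are hypothetical, and whitespace splitting is a simplifying assumption about what counts as a word.

```python
def length_check(text: str, minimum_words: int = 10, recommended_words: int = 50) -> str:
    """Classify a text by word count before sending it for analysis."""
    n = len(text.split())  # whitespace-separated words (simplifying assumption)
    if n < minimum_words:
        return "too_short"          # results are likely to vary
    if n < recommended_words:
        return "below_recommended"  # usable, but below the recommended length
    return "ok"


if __name__ == "__main__":
    print(length_check("Just a few words here"))  # too_short
```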






