ChatGPT raises a debate over how humans learn language
题材:科普类
出处:The Economist《经济学人》
字数:739 words
[1] When Deep Blue, a chess computer, defeated Garry Kasparov, a world champion, in 1997, many gasped in fear of machines triumphing over mankind. In the intervening years, artificial intelligence has done some astonishing things, but none has managed to capture the public imagination in quite the same way. Now, though, the astonishment of the Deep Blue moment is back, because computers are employing something that humans consider their defining ability: language.
【1997年,当国际象棋计算机“深蓝”击败世界冠军加里·卡斯帕罗夫时,许多人因害怕机器战胜人类而倒吸一口冷气。在此后的这些年里,人工智能做出了一些令人惊叹的事情,但没有一项能像当年那样抓住公众的想象力。然而现在,“深蓝时刻”的那种惊叹又回来了,因为计算机正在使用一种人类视为自身标志性能力的东西:语言。】
【重点词汇】
gasp /ɡɑːsp/ v. (尤指惊讶或疼痛时的)倒吸气
triumph /ˈtraɪʌmf/ v. 打败 n. 巨大成功
intervening /ˌɪntəˈviːnɪŋ/ adj. 发生于其间的
astonishing /əˈstɒnɪʃɪŋ/ adj. 令人十分惊讶的
[2] Or are they? Certainly, large language models (LLMs), of which the most famous is ChatGPT, produce what looks like impeccable human writing. But a debate has ensued about what the machines are actually doing internally, what it is that humans, in turn, do when they speak—and, inside the academy, about the theories of the world’s most famous linguist, Noam Chomsky.
【抑或未必?诚然,大型语言模型(LLMs),其中最著名的是ChatGPT,能产出看起来无可挑剔的人类书面语。但一场争论随之而起:机器内部究竟在做什么;反过来,人类说话时又在做什么;而在学术界,争论还围绕着世界上最著名的语言学家诺姆·乔姆斯基的理论展开。】
【重点词汇】
impeccable /ɪmˈpekəb(ə)l/ adj. 无可挑剔的
internally /ɪnˈtɜːnəli/ adv. 在内部
【长难句分析】
Certainly, large language models (LLMs), of which the most famous is ChatGPT, produce what looks like impeccable human writing.
【结构分析】
主句:large language models (LLMs) produce + 宾语从句
非限制性定语从句:of which the most famous is ChatGPT(修饰 LLMs)
宾语从句:what looks like impeccable human writing(作 produce 的宾语)
[3] Although Professor Chomsky’s ideas have changed considerably since he rose to prominence in the 1950s, several elements have remained fairly constant. He and his followers argue that human language is different in kind (not just degree of expressiveness) from all other kinds of communication. All human languages are more similar to each other than they are to, say, whale song or computer code. Professor Chomsky has frequently said a Martian visitor would conclude that all humans speak the same language, with surface variation.
【尽管乔姆斯基教授自20世纪50年代成名以来,他的思想已有很大变化,但有几个要素一直相当稳定。他和他的追随者认为,人类语言与其他一切交流形式在种类上(而不仅仅是表达力的程度上)存在差异。所有人类语言彼此之间的相似程度,都远高于它们与鲸歌或计算机代码之类事物的相似程度。乔姆斯基教授常说,一位来访的火星人会得出结论:所有人类说的其实是同一种语言,只是表面上有所变化。】
【重点词汇】
considerably /kənˈsɪdərəbli/ adv. 相当多地
prominence /ˈprɒmɪnəns/ n. 出名
fairly /ˈfeəli/ adv. 相当地;在很大程度上
degree /dɪˈɡriː/ n. 程度
frequently /ˈfriːkwəntli/ adv. 频繁地
variation /ˌveəriˈeɪʃ(ə)n/ n. 变化
[4] Perhaps most notably, Chomskyan theories hold that children learn their native languages with astonishing speed and ease despite “the poverty of the stimulus”: the sloppy and occasional language they hear in childhood. The only explanation for this can be that some kind of predisposition for language is built into the human brain.
【也许最值得注意的是,乔姆斯基理论认为,尽管存在“刺激贫乏”(即儿童在童年听到的语言是零散而随意的),儿童学习母语的速度之快、过程之轻松仍令人惊讶。对此唯一的解释只能是:人类大脑中天生内置了某种语言倾向。】
【重点词汇】
stimulus /ˈstɪmjələs/ n. 刺激
sloppy /ˈslɒpi/ adj. 草率的
occasional /əˈkeɪʒən(ə)l/ adj. 偶尔的;零星的
predisposition /ˌpriːdɪspəˈzɪʃ(ə)n/ n. 倾向
[5] Chomskyan ideas have dominated the linguistic field of syntax since their birth. But many linguists are strident anti-Chomskyans. And some are now seizing on the capacities of LLMs to attack Chomskyan theories anew.
【乔姆斯基的思想自诞生之日起,便一直主导着语言学中的句法(syntax)研究领域。但许多语言学家是强硬的反乔姆斯基主义者。其中一些人如今正抓住LLMs的能力,重新向乔姆斯基的理论发起攻击。】
【重点词汇】
dominate /ˈdɒmɪneɪt/ v. 占据主导地位
strident /ˈstraɪd(ə)nt/ adj. 强硬的
seize on 利用
[6] Grammar has a hierarchical, nested structure involving units within other units. Words form phrases, which form clauses, which form sentences and so on. Chomskyan theory posits a mental operation, “Merge”, which glues smaller units together to form larger ones that can then be operated on further (and so on). In a recent New York Times op-ed, the man himself (now 94) and two co-authors said “we know” that computers do not think or use language as humans do, referring implicitly to this kind of cognition. LLMs, in effect, merely predict the next word in a string of words.
【语法具有层级化的嵌套结构,即单元之中还包含其他单元。单词组成短语,短语组成分句,分句组成句子,依此类推。乔姆斯基理论假定了一种心理操作,即“合并”(Merge):它把较小的单元粘合成更大的单元,而后者还可以继续参与运算(依此类推)。在《纽约时报》最近的一篇专栏文章中,乔姆斯基本人(现年94岁)与两位合著者表示,“我们知道”计算机并不像人类那样思考或使用语言,暗指的正是这种认知。LLMs实际上只是在预测一串单词中的下一个单词。】
【重点词汇】
hierarchical /ˌhaɪəˈrɑːkɪk(ə)l/ adj. 按等级划分的
posit /ˈpɒzɪt/ v. 假设
implicitly /ɪmˈplɪsɪtli/ adv. 含蓄地;暗含地
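上文提到的“合并”(Merge)操作,可以用一小段示意代码来体会:下面用 Python 元组模拟“把较小单元粘合成更大单元、结果再继续参与运算”的嵌套过程。这只是帮助理解原文概念的玩具示例,并非任何真实语言学或 LLM 软件的实现,其中的函数名均为演示所设。

```python
# 示意:用嵌套元组模拟乔姆斯基理论中的 Merge(合并)操作。
# 每次合并把两个句法单元粘合成一个更大的单元,新单元可继续参与合并。

def merge(left, right):
    """把两个句法单元合并为一个更大的单元(以二元组表示)。"""
    return (left, right)

# 逐层构建 "the cat ate the fish":单词 -> 短语 -> 分句
np1 = merge("the", "cat")    # 名词短语 (the cat)
np2 = merge("the", "fish")   # 名词短语 (the fish)
vp = merge("ate", np2)       # 动词短语 (ate (the fish))
clause = merge(np1, vp)      # 分句 ((the cat) (ate (the fish)))

def depth(unit):
    """计算嵌套深度:单词深度为 0,每合并一层加 1。"""
    if isinstance(unit, str):
        return 0
    return 1 + max(depth(u) for u in unit)

print(clause)        # (('the', 'cat'), ('ate', ('the', 'fish')))
print(depth(clause)) # 3:结构是层级嵌套的,而非线性词串
```

可以看到,同样五个单词,合并后的结构携带了线性词串所没有的层级信息,这正是原文所说“单元之中还包含其他单元”的含义。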
[7] Yet it is hard, for several reasons, to fathom what LLMs “think”. Details of the programming and training data of commercial ones like ChatGPT are proprietary. And not even the programmers know exactly what is going on inside.
【然而,由于几方面的原因,很难弄清LLMs在“思考”什么。像ChatGPT这样的商业模型,其编程和训练数据的细节是专有保密的。甚至连程序员自己也不完全清楚其内部到底发生了什么。】
【重点词汇】
fathom /ˈfæðəm/ v. 理解
proprietary /prəˈpraɪət(ə)ri/ adj. 专有的;所有权的
[8] Linguists have, however, found clever ways to test LLMs’ underlying knowledge, in effect tricking them with probing tests. And indeed, LLMs seem to learn nested, hierarchical grammatical structures, even though they are exposed to only linear input, ie, strings of text. They can handle novel words and grasp parts of speech. Tell ChatGPT that “dax” is a verb meaning to eat a slice of pizza by folding it, and the system deploys it easily: “After a long day at work, I like to relax and dax on a slice of pizza while watching my favourite TV show.” (The imitative element can be seen in “dax on”, which ChatGPT probably patterned on the likes of “chew on” or “munch on”.)
【然而,语言学家已经找到了一些巧妙的方法来测试LLMs的底层知识,实际上是用探查性测试来“套”它们的话。而事实上,LLMs似乎确实学到了嵌套的、层级化的语法结构,尽管它们接触到的只是线性输入,即文本字符串。它们能处理新词,也能把握词性(parts of speech)。告诉ChatGPT,“dax”是一个动词,意思是把一片披萨折起来吃,系统就能轻松运用这个词:“在漫长的一天工作之后,我喜欢放松一下,一边看我最喜欢的电视节目,一边dax on一片披萨。”(模仿的成分可以从“dax on”中看出来,ChatGPT很可能是仿照“chew on”或“munch on”之类的搭配造出来的。)】
【重点词汇】
underlying /ˌʌndəˈlaɪɪŋ/ adj. 基础的
probing /ˈprəʊbɪŋ/ adj. 探查性的
deploy /dɪˈplɔɪ/ v. 有效地利用
imitative /ˈɪmɪtətɪv/ adj. 模仿的
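原文所说的“探查性测试”,在研究中常采用“最小对比对”(minimal pairs)的形式:给模型一对仅在语法性上不同的句子,看它是否给合乎语法的那句打出更高的分数。下面用一个自带微型语料的二元(bigram)计数模型充当被测“模型”,仅为演示这一探测流程的思路;语料、打分函数均为演示假设,真实研究中会换成 ChatGPT 这类 LLM 给出的对数概率。

```python
# 示意:用最小对比对探测一个(玩具)语言模型的句法知识。
from collections import Counter

# 微型训练语料(仅为演示而构造)
corpus = (
    "the cat is here . the cats are here . "
    "the dog is here . the dogs are here ."
).split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def score(sentence):
    """句子的伪概率:相邻词对的平滑相对频率之积,越大越'像'训练语料。"""
    words = sentence.split()
    s = 1.0
    for a, b in zip(words, words[1:]):
        s *= (bigrams[(a, b)] + 0.1) / (unigrams[a] + 0.1 * len(unigrams))
    return s

# 一对仅在主谓一致上不同的句子:合语法 vs 不合语法
grammatical = "the cats are here ."
ungrammatical = "the cats is here ."
print(score(grammatical) > score(ungrammatical))  # True:模型偏好合语法的一句
```

这个玩具模型只统计相邻词对,当然远不能证明它掌握了层级结构;真正的研究会精心构造要求跨越多个单词的长距离一致现象的句对,以区分“线性统计”与“层级语法”两种解释。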
[9] What about the “poverty of the stimulus”? After all, GPT-3 (the LLM underlying ChatGPT until the recent release of GPT-4) is estimated to be trained on about 1,000 times the data a human ten-year-old is exposed to. That leaves open the possibility that children have an inborn tendency to grammar, making them far more proficient than any LLM. In a forthcoming paper in Linguistic Inquiry, researchers claim to have trained an LLM on no more text than a human child is exposed to, finding that it can use even rare bits of grammar. But other researchers have tried to train an LLM on a database of only child-directed language (that is, of transcripts of carers speaking to children). Here LLMs fare far worse. Perhaps the brain really is built for language, as Professor Chomsky says.
【那么“刺激贫乏”又怎么说呢?毕竟,据估计,GPT-3(在GPT-4最近发布之前,ChatGPT所基于的那个LLM)接受训练所用的数据,大约是一个10岁儿童所接触数据量的1000倍。这就留下了一种可能:儿童天生就有语法倾向,因而远比任何LLM都更精通语言。在《语言学探究》(Linguistic Inquiry)即将刊出的一篇论文中,研究人员声称,他们只用不超过人类儿童所接触文本量的语料训练了一个LLM,发现它甚至能使用一些罕见的语法结构。但另一些研究人员尝试只用“儿向语言”数据库(即看护人对儿童说话的文字记录)来训练LLM。在这种情况下,LLMs的表现要差得多。也许正如乔姆斯基教授所说,大脑真的是为语言而生的。】
【重点词汇】
proficient /prəˈfɪʃ(ə)nt/ adj. 精通的
forthcoming /ˌfɔːθˈkʌmɪŋ/ adj. 即将发生的
rare /reə(r)/ adj. 罕见的
transcript /ˈtrænskrɪpt/ n. (根据录音或笔记整理的)文字本
[10] It is difficult to judge. Both sides of the argument are marshalling LLMs to make their case. The eponymous founder of his school of linguistics has offered only a brusque riposte. For his theories to survive this challenge, his camp will have to put up a stronger defence.
【这很难判断。争论双方都在调用LLMs来支持自己的观点。这一语言学流派的同名创始人本人,只给出了一个生硬简短的回击。要让他的理论经受住这次挑战,他的阵营必须拿出更有力的辩护。】
【重点词汇】
marshal /ˈmɑːʃ(ə)l/ v. 召集
eponymous /ɪˈpɒnɪməs/ adj. 同名的
brusque /bruːsk/ adj. 唐突的;生硬简短的
riposte /rɪˈpɒst/ n. 巧妙的反驳
put up (a defence) 进行,组织(辩护、抵抗)