
2024-12-03 11:51:49

Computers are learning to read your mind through your facial expressions (2)

Current technology is pretty good at recognizing six basic emotions (fear, anger, joy, surprise, disgust and sadness), says Jeffrey Cohn, a psychology professor at the University of Pittsburgh. But there are thousands of combinations and variations, he says. For instance, there are different kinds of disgust: to physical stimuli, to moral stimuli and so on.

Cohn and his colleagues have defined 40 facial “action units” (the smallest visually distinguishable changes in facial appearance) and have compiled a database of 210 people, with 10 images of each, illustrating different combinations of action units.

Cohn says he hopes to see the technology refined so it can be used reliably to diagnose people with mental disorders and assess the efficacy of treatment. He says it might also be used as an adjunct in lie-detector tests and in security systems that attempt to identify people by their faces.

Meanwhile, IBM is working on computer recognition of emotional expressions at the Almaden Research Center in San Jose. Through its Blue Eyes project, it’s developing algorithms for “affect detection” based on the position of the eyebrows and mouth corners.

IBM is also trying to perfect an “emotion mouse” that will determine users’ emotional states by measuring pulse, temperature, general somatic activity and galvanic skin response. The company has mapped those measurements for anger, fear, sadness, disgust, happiness and surprise.

The idea is to have the computer adopt a working style that fits a user’s personality. It might, for example, offer to present a different kind of display if it senses that the user is frustrated.

IBM says computers would be much more powerful if they had even a small fraction of the perceptual ability of animals or humans. The aim of Blue Eyes is to enable humans and computers to work together as partners.















Designing a speech application includes presenting data for delivery over the phone, constructing a call flow and enabling prompts and grammars. VoiceXML provides a common set of rules as a flexible foundation, but it’s up to the designer to create the appropriate flow and personality for a speech system.

Just as a browser interprets HTML content and presents it visually over the Web, a speech (or voice) browser must interpret VoiceXML for presentation over the telephone. The speech browser serves as a gateway between a call and an Internet connection. It interprets VoiceXML code and manages dialog between callers and VoiceXML content located at a Web site.

Speech browser software also maintains the calls, presents voice prompts that equate to URLs and downloads pages for audio interaction.
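
To make the speech browser's role more concrete, the following Python sketch shows roughly what "interpreting" a VoiceXML page might involve: walking a form and working out which prompts to speak and where to listen for an answer. It is an illustration under stated assumptions, not a real speech browser. The sample page, the `collect_prompts` function and the `weather` and `city` names are all hypothetical, and the sample omits the XML namespace that production VoiceXML documents normally declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical VoiceXML page. A real speech browser would download this
# from a Web site; the namespace is omitted to keep tag matching simple.
SAMPLE_VXML = """
<vxml version="2.0">
  <form id="weather">
    <block>
      <prompt>Welcome to the weather line.</prompt>
    </block>
    <field name="city">
      <prompt>Which city would you like?</prompt>
    </field>
  </form>
</vxml>
"""


def collect_prompts(vxml_text):
    """Return (item kind, field name or None, prompt text) in document order."""
    root = ET.fromstring(vxml_text.strip())
    steps = []
    for form in root.iter("form"):
        for item in form:
            # Only <block> (speak) and <field> (speak, then listen) are handled.
            if item.tag not in ("block", "field"):
                continue
            for prompt in item.iter("prompt"):
                # Collapse whitespace so the text is ready for text-to-speech.
                text = " ".join((prompt.text or "").split())
                steps.append((item.tag, item.get("name"), text))
    return steps


if __name__ == "__main__":
    for kind, name, text in collect_prompts(SAMPLE_VXML):
        if kind == "field":
            action = "speak, then listen for '%s'" % name
        else:
            action = "speak"
        print("%s: %s" % (action, text))
```

An actual speech browser would go further: it would hand each prompt to a text-to-speech engine, match the caller's reply against a grammar, and follow links to other pages. The sketch only covers the step of reading the dialog structure out of the markup.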

A VoiceXML-based application using a speech browser provides flexibility, benefiting callers and content providers alike. A caller could use a rotary telephone or the newest wireless model and receive the same service. Content providers have a choice of locating a speech browser at their facilities or outsourcing to an application service provider, carrier or service bureau. As with current visual Web models, trade-offs have to be weighed between ease of implementation, flexibility, cost and other factors.

Today, companies are building businesses on speech-based Web content by providing telephony access and presentation of data in interactive audio formats. These businesses host speech applications to provide greater scalability, maintenance and support, while letting content providers focus on their core business.

A number of obvious and subtle factors are converging to bring the Web model of VoiceXML to prominence. Many consider the broad industry support of VoiceXML its most apparent strength. Other factors such as recent improvements in text-to-speech quality mean information can be immediately presented in audio format without the time and expense of recording a voice. Looking at the evolution of the Web, it’s clear the adoption of a common format for content presentation, HTML, fueled the growth of the Web. The VoiceXML standard holds similar promise for speech.














