Interfaces

  • James Rochez (4/23/2009)


    ...you should try a wireless brain-wave reader...

    I wouldn't dare subject a computer to the things in my head! :hehe:

    It'd probably crash and reboot like on this week's episode of Better Off Ted where the computer thought all the employees were crammed into a backpack and launched into space.

  • I haven't worked on VR before but I can just imagine that it cannot work as well as a keyboard. I have to say, though, that I hate typing, and if someone would invent a really good VR system then I would be glad. As someone said, I don't think that would work well for programming.

    :-P
    Manie Verster
    Developer
    Johannesburg
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett McKinley Dirksen (Well, I am trying. - Manie Verster)

  • I guess a lot of us grew up seeing the Star Trek interface as the ideal - "Computer, compare the occurrences of this with the data on that...". But even assuming human-level understanding, did you ever hear "No, computer, I wanted the nulls included, but not for the whole galaxy, just the main road past my house"? That's a speech-to-meaning converter you need there - another level of complexity!

  • People fail to realize how intimately human speech and language are related to the human brain and physiology. It's not just an information transfer; far more information is left unsaid than is actually said. Complex levels of thought structure are built within the listener's mind to approximately match structures in the speaker's mind, and this is facilitated by the organization of language itself.

    Not just any mathematically possible language makes a usable human language. We are born with complex structures that expect various characteristics of language (parts of speech, organization of words, modifiers etc) and these are not random. As different as human languages appear to be, they are actually far more similar underneath than they are different.

    The specialization goes beyond that. There are only a limited number of sounds used in all the world's languages. These are the sounds that our brains are programmed to recognize, with minimal confusion from background noise, variations in speaker voice etc. Most languages use most of them, but not all of them. Young children become adept at hearing (as well as pronouncing) the sounds in their language environment; the ones they are not exposed to often atrophy. There are sound intonations in Vietnamese, for example, that adult English speakers often cannot even hear, much less pronounce. There are sounds in English that confuse many foreign speakers.

    VR is not just pattern matching. A VR system needs to identify these phonemes consistently from different speakers and voices, against background noise, colds, conflicting conversations etc. Additionally, it requires at least a basic understanding of meaning before it can become reliable.

    We are a long way from that.

    ...

    -- FORTRAN manual for Xerox Computers --

  • jay holovacs (4/24/2009)


    There are sounds in English that confuse many foreign speakers.

    .. and indeed English speakers. 'scuse me while I kiss this guy

  • Ah now you are going to get me started on the "works" of the Rev. Dr. Spooner. That wonderful person from whom is taken the inspiration of the Spoonerism. As a former radio broadcaster I am much aware of the way American English can be amusingly twisted. It's why I laugh when I see a "fire truck".

    Then there are the vocal tricks like the ICAO Alphabet (a better page exists here) and the old radio 10 codes (here).

    There you have the 4-1-1 on this. 10-4 good buddy?

    ATB
    Charles Kincaid

  • I've tried various voice recognition systems, and found them inadequate to my needs. I also type about 100 WPM, so maybe it's not a fair comparison.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • My supervisor and I were just having a discussion on a similar topic yesterday regarding data entry. The topic at hand, however, was not V/R software; it was whether it would be a good idea to use our OnBase software to autofill data based on documents submitted. I have reservations about this approach because inaccuracies in the submitted data would not be caught by the software. An example of such an issue happened yesterday: an e-mail address submitted for a preceptor did not match the spelling of the preceptor's name. This would have been perfectly acceptable, and there may have been a logical reason behind it, but to verify, I contacted the sender of the data and she confirmed that there was a typo in the original submission. This would not have been caught by the software, and while the notion exists that we can scan through the data afterwards, I personally doubt my ability to catch a mistake by simply scanning the data - I need to be more interactive with it (manual data entry).
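
    As a rough illustration of the kind of check described above, here is a minimal Python sketch that flags an e-mail address for human review when the surname does not appear in the part before the "@". The function name email_needs_review and the sample records are made up for this example; nothing here is an OnBase feature. It can only raise a flag, since an address may legitimately differ from the name, so someone still has to confirm with the sender.

```python
import re


def email_needs_review(full_name, email):
    """Return True when the surname is absent from the e-mail's local part.

    Hypothetical helper: it assumes the last alphabetic word of the name is
    the surname, which will not hold for every naming convention.
    """
    name_words = re.findall(r"[a-z]+", full_name.lower())
    if not name_words:
        return True  # nothing to compare against, so ask a human
    surname = name_words[-1]
    local_part = email.split("@", 1)[0].lower()
    letters_only = re.sub(r"[^a-z]", "", local_part)
    return surname not in letters_only


# Made-up sample submissions: (name, e-mail address as submitted)
submissions = [
    ("Jane Smith", "jane.smith@example.org"),
    ("Jane Smith", "jane.smtih@example.org"),  # transposed letters
]

for name, address in submissions:
    if email_needs_review(name, address):
        print(f"Check with sender: {name} <{address}>")
```

    A check like this narrows down what needs a second look; it does not replace the call to the sender.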

  • Nice re-run, but I think voice recognition technology has come a long way since 2009. I manage the infrastructure for a group of hospitals, and I have to say that the Nuance products are some of the best tested I've worked with. Our medical staff love it compared to what they have been using for years. We are running this in a SQL Server 2k8r2 VM environment, and the performance is fantastic and the support has been great. Their voice recognition is miles above what Siri or Android puts out on consumer devices.

  • I see two big problems with voice recognition as an input method:

    1) Current systems require training to be any good. The variation in sound between different speakers of a language can be very large, and this can challenge the training. This can get really hard when the range of sound variation is such that the set of phonemes differs from person to person, as it does in every language I know, which can mean that things which are distinct for one person are homophones for another. For example in English, some speakers pronounce /d/ and /ð/ the same, so that the only way one can distinguish "udder" from "other" is by context. To handle that adequately the VR system has to have enough understanding to make choices based on context even once its training has taught it that the two phonemes are merged in this person's speech.

    2) While there may be reasonable VR systems for the principal sound patterns of major languages like English, Spanish, French, German, Mandarin, Russian,... for the majority of languages there is as yet no decent VR system (often no VR system at all).

    Obviously point (2) above will become less important over time, as more and more languages die out and/or as it becomes commercially sensible to provide good VR systems for less important languages, or for less important variants of the important languages.

    However, point (1) is not going to go away any time soon, and may get worse as systems try to cater for more variation between different users' speech. For example a good VR system for Irish would have to cope with at least the 7 dialects (Waterford, Kerry, West Cork, South Connemara, mid-Connacht, Achill/Erris, Na Rossa/Gaoth Dobhair) which are used by most native speakers and might also have to cope with An Caighdeán Oifigiúil (the official standard for Irish) which is spoken by some (mostly anglophone Irish) who have learnt Irish as a "foreign" language; but these 8 forms differ not only in pronunciation and vocabulary, but also in accidence, grammar, and syntax, which strikes me as posing a problem that may not be solved any time soon. A similar problem exists for Scottish Gaelic, perhaps not quite as bad as for Irish Gaelic. For German too, unless one wishes to declare Steirisch and Bayrisch and all the other dialects "not German", which would hardly be satisfactory, and in French (perhaps to a lesser extent) and probably in every language with more than ten thousand speakers. And of course it's even worse in English.

    There is another problem which others have mentioned, and that's input speed. It's no longer a problem for me, as I now type much more slowly than I used to, so my speech speed might be as fast as my typing speed. But for people a bit younger who are competent at typing it's probably a real problem.

    One concern some people have expressed I think is a false concern: there's no reason why VR should be less accurate than typed input provided the input voice is within the range that the VR system is designed to cope with and the VR system has been trained for that voice. However, for some languages it may be some time before VR systems can cope with an acceptable range. For example, I can't imagine the City authorities in Waterford daring to advertise for a data entry clerk who speaks Gaoluinn with the accent of An Rinn, people who speak Gaeilge with a South Connemara accent need not bother to apply: the inevitable uproar about unacceptable discrimination would probably be too great. That probably means that the problem of handling quite a few variants of the language has to be solved before VR becomes a usable data entry method for Irish. So while VR may be OK for some applications now, it isn't OK for all applications in all languages yet.

    Tom

  • On the general data accuracy issue, data entry isn't the only source of errors. As I see it, there are at least five sources of error:

    1) Data entry

    2) Inaccuracy of data source

    3) Corruption of data in passage from source to data entry (whether by deliberate falsification or by accident)

    4) Application errors

    5) Failure of DB to provide adequate data integrity assurance (bad schema design not coped with by code)

    I distinguish between 4 and 5 because 4 is something that should be avoided by having correct application code while 5 is something that should be fixed by changes within the database - after all, there's no point in writing an application that does the wrong thing to the data and equally there's no point in using an RDBMS if you aren't going to design the schemata for simple data integrity.
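
    To make point 5 a little more concrete, here is a minimal sketch of schema-level integrity, using Python's built-in sqlite3 module only because it is easy to run anywhere. The preceptor table, its columns, and the try_insert helper are all hypothetical, and the e-mail check is deliberately crude. The idea is simply that the database refuses bad rows whatever the application (or the data entry method) sends it.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table: the constraints live in the schema, not in application code.
conn.execute(
    """
    CREATE TABLE preceptor (
        preceptor_id INTEGER PRIMARY KEY,
        full_name    TEXT NOT NULL CHECK (length(trim(full_name)) > 0),
        email        TEXT NOT NULL UNIQUE CHECK (email LIKE '%_@_%')
    )
    """
)


def try_insert(full_name, email):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO preceptor (full_name, email) VALUES (?, ?)",
                (full_name, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

    With this schema, try_insert("Jane Smith", "jane.smith@example.org") is accepted, while a blank name, an address with no "@", or a second row with the same address is rejected. A well-formed typo such as "jane.smtih@example.org" still gets through, which is why errors under 1, 2, and 3 need other defences.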

    Concentrating on just number 1 in that list is maybe not a good idea, but that's what happened both in the editorial and in the resulting discussion.

    A DBA or database developer can only prevent errors falling under 5 in my list, and attempt to detect and maybe correct errors falling under 1, 2, and 3. But there should be someone responsible for the overall system whose job it is to ensure that point 3 errors are extremely rare, and someone else who is responsible for ensuring that point 2 errors are extremely rare and as far as is possible that most of those that do happen are detected and corrected before the data is sent for data entry. Ideally someone should be responsible for ensuring that most data entry errors are detected and corrected before the database has to worry about them - entering everything twice is expensive, but was the traditional method of doing this, and there's no reason why that should be different with VR than with typing.
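
    The "enter everything twice" method mentioned above could be sketched roughly like this in Python: the same record is keyed (or dictated) twice, independently, and any field where the two passes disagree goes back to a person. The function name double_entry_mismatches and the sample record are invented for the illustration.

```python
def double_entry_mismatches(first_pass, second_pass):
    """Return the sorted names of fields whose values differ between two passes.

    Each pass is a dict of field name -> entered value. A field missing from
    one pass counts as a mismatch.
    """
    all_fields = set(first_pass) | set(second_pass)
    return sorted(
        field
        for field in all_fields
        if first_pass.get(field) != second_pass.get(field)
    )


# Made-up record, entered twice by different operators (or different methods).
first = {"name": "Jane Smith", "email": "jane.smith@example.org"}
second = {"name": "Jane Smith", "email": "jane.smtih@example.org"}

for field in double_entry_mismatches(first, second):
    print(f"Mismatch in {field!r}: {first.get(field)!r} vs {second.get(field)!r}")
```

    The comparison does not care whether a pass came from a keyboard or from VR. It only catches errors where the two passes differ, so a mistake already present in the source (point 2) passes straight through.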

    Tom

  • L' Eomot Inversé (11/5/2013)


    One concern some people have expressed I think is a false concern: there's no reason why VR should be less accurate than typed input provided the input voice is within the range that the VR system is designed to cope with and the VR system has been trained for that voice. However, for some languages it may be some time before VR systems can cope with an acceptable range. For example, I can't imagine the City authorities in Waterford daring to advertise for a data entry clerk who speaks Gaoluinn with the accent of An Rinn, people who speak Gaeilge with a South Connemara accent need not bother to apply: the inevitable uproar about unacceptable discrimination would probably be too great. That probably means that the problem of handling quite a few variants of the language has to be solved before VR becomes a usable data entry method for Irish. So while VR may be OK for some applications now, it isn't OK for all applications in all languages yet.

    I'm semi-familiar with how things work in U.S. media circles: they actually have classes on how to speak with a Midwestern accent. It is more or less the standard for U.S. TV news reporters. So creating a Gaelic "standards" organization and putting that in as the requirement in an employment advertisement would be the work-around. Same with the Russian republic and other multi-ethnic or regional countries in Europe.

    But really the original post is already dated. If you look at the progression of cochlear implants, Google glasses, the possibility of prosthetic eyes, and the various experimental limb prosthetics that can give "touch" feedback we are probably within a generation of a true human/computer interface.

    Plus the advances in robotics will help as well. There was a recent NOVA show (not sure if it's the right one) about the development of a combo of an elephant's trunk and some arthropod's grasper that could pick up an egg or a 300 pound load with the same arm, while the arm, by itself, wouldn't hurt a human on physical contact.

    So maybe the GIGO will come back to the old Elephant in the room parable.



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.
