Arts, buffer, check, clutter, cobbler, colorblind, concurrent, convert, cook, crash, dialog, dump, expect, file, folks, fortune, genius, global, hello, indent, less, links, meanwhile, mirror, screen, sparse, suck, tree, units, words. What do these ordinary English words have in common? They are also names of software projects, which becomes a problem if you want to recognize package names in text. I understand that in the old days, the name of a command or application was only relevant in the context of the computer it ran on, and file names had to be short. Some of these names have allowed for a variety of jokes. But why, in the age of portable programs, WWW and search engines, can’t people come up with less ambiguous names? I mean, it’s not hard to join two words, or, at a minimum, prefix a word with a vowel, like, uhm, a round fruit does. 🙂
Oh, and did I mention that we have over 160 packages with a 2-3 letter name? The one mentioned in the title is a programming language, btw.
LOL.
Maybe some kind of context awareness could help. But given that even human beings can be fooled by clever, ambiguous statements, it can never be perfect.
I think you have to blacklist such packages and skip detecting them. Let those bugs rot as a punishment. 😉
Yes, I’m not writing a paper on computational linguistics, so I’ll just blacklist a number of packages. It’s not just English words, BTW, but the bot probably won’t be able to recognize a bug in the “rpm” program itself and a couple of others. Too bad the hackweek is over now and I have nothing close to an alpha version 🙁
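For the curious, the blacklist idea could look roughly like the sketch below. This is not the bot's actual code (there is none yet); the package set, the blacklist contents, and the function name `find_packages` are all made up for illustration, and a real version would load the full package list from the distribution.

```python
import re

# Hypothetical sample data: a tiny stand-in for the real package list.
KNOWN_PACKAGES = {"rpm", "less", "screen", "tree", "zypper", "yast2"}

# Hypothetical blacklist: names too ambiguous to detect reliably.
BLACKLIST = {"rpm", "less", "screen", "tree"}

# A token starts with a letter or digit and may contain + . _ -
TOKEN_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9+._-]*")


def find_packages(text):
    """Return known, non-blacklisted package names in order of first mention."""
    found = []
    for token in TOKEN_RE.findall(text):
        # Drop a trailing sentence period and compare case-insensitively.
        name = token.rstrip(".").lower()
        if name in KNOWN_PACKAGES and name not in BLACKLIST and name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    print(find_packages("The screen went blank after zypper updated less and yast2."))
```

The obvious trade-off: anything on the blacklist is never matched, so a genuine report about "rpm" or "screen" slips through unnoticed. That is the price for not doing real context analysis.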