$ zcat Sources.gz | grep -o -E 'debhelper [(]>= 9[.][0-9]{,7}([^0-9)][^)]*)?[)]' | sort | uniq -c | sort -rn
    338 debhelper (>= 9.0.0)
     70 debhelper (>= 9.0)
     18 debhelper (>= 9.0.0~)
     10 debhelper (>= 9.0~)
      2 debhelper (>= 9.2)
      1 debhelper (>= 9.2~)
      1 debhelper (>= 9.0.50~)
Is it a way to protest against debhelper's current version scheme?
In i18nspector I try to support all the encodings that were blessed by gettext, but it turns out to be more difficult than I anticipated:
$ roundtrip() { c=$(echo $1 | iconv -t $2); printf '%s -> %s -> %s\n' $1 $c $(echo $c | iconv -f "$2"); }
$ roundtrip ¥ EUC-JP
¥ -> \ -> \
$ roundtrip ¥ SHIFT_JIS
¥ -> \ -> ¥
$ roundtrip ₩ JOHAB
₩ -> \ -> ₩
Now let's do the same in Python:
$ python3 -q
>>> roundtrip = lambda s, e: print('%s -> %s -> %s' % (s, s.encode(e).decode('ASCII', 'replace'), s.encode(e).decode(e)))
>>> roundtrip('¥', 'EUC-JP')
¥ -> \ -> \
>>> roundtrip('¥', 'SHIFT_JIS')
¥ -> \ -> \
>>> roundtrip('₩', 'JOHAB')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
UnicodeEncodeError: 'johab' codec can't encode character '\u20a9' in position 0: illegal multibyte sequence
So is 0x5C a backslash or a yen/won sign? Or both?
And what if 0x5C could be a second byte of a two-byte character? What could possibly go wrong?
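To make the second question concrete, here is a small sketch using Python's built-in shift_jis codec. It shows two well-known characters whose second byte is 0x5C, so any code that scans the raw bytes for a backslash will find one that isn't there. The helper name has_backslash_byte is made up for this illustration.

```python
def has_backslash_byte(text, encoding="shift_jis"):
    """Return True if the encoded form of text contains the byte 0x5C."""
    return b"\x5c" in text.encode(encoding)

# Neither character contains a backslash, yet both encode to a
# two-byte Shift_JIS sequence whose trailing byte is 0x5C.
for ch in ("表", "ソ"):
    raw = ch.encode("shift_jis")
    print(ch, raw.hex(), has_backslash_byte(ch))
```

A byte-oriented parser that treats 0x5C as an escape character would misread such text, for instance by swallowing the byte that follows.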
Have you ever wanted to use Lintian's spell-checker against arbitrary files? Now you can do it with spellintian:
$ zrun spellintian --picky /usr/share/doc/RFC/best-current-practice/rfc*
/tmp/0qgJD1Xa1Y-rfc1917.txt: amoung -> among
/tmp/kvZtN435CE-rfc3155.txt: transfered -> transferred
/tmp/o093khYE09-rfc3481.txt: unecessary -> unnecessary
/tmp/4P0ux2cZWK-rfc6365.txt: charater -> character
mwic (Misspelled Words In Context) takes a different approach. It uses classic spell-checking libraries (via Enchant), but it groups misspellings and shows them in their contexts. That way you can quickly filter out false positives, which are very common in technical texts, using visual grep:
$ zrun mwic /usr/share/doc/debian/social-contract.txt.gz
DFSG:
| …an Free Software Guidelines (DFSG)
| …an Free Software Guidelines (DFSG) part of the
                                ^^^^

Perens:
|    Bruce Perens later removed the Debian-spe…
| by Bruce Perens, refined by the other Debian…
           ^^^^^^

Ean, Schuessler:
| community" was suggested by Ean Schuessler. This document was drafted
                              ^^^ ^^^^^^^^^^

GPL:
| The "GPL", "BSD", and "Artistic" lice…
       ^^^

contrib:
| created "contrib" and "non-free" areas in our…
           ^^^^^^^

CDs:
| their CDs. Thus, although non-free wor…
        ^^^
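The grouping idea can be sketched in a few lines of Python. This is not mwic's code: a plain set of known words (an assumption of this sketch) stands in for the Enchant dictionary, and the function name and the width parameter are invented for the illustration.

```python
import re
from collections import defaultdict

def misspellings_in_context(text, known_words, width=20):
    """Group words missing from known_words, with a snippet per occurrence.

    known_words is a set of lowercase words; a real tool would ask a
    spell-checking library instead.
    """
    groups = defaultdict(list)
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group()
        if word.lower() in known_words:
            continue
        # Keep up to `width` characters on each side of the word.
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        groups[word].append(text[start:end])
    return dict(groups)
```

Because every occurrence of a word ends up under one heading, a name such as "Perens" can be dismissed once instead of once per line.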
Instances of the “for those who care about X” snowclone on Debian mailing lists:
Releasing the shift key is hard.
TL;DR: don't put valuable passwords in ~/.netrc
In the olden days, the ~/.netrc file was used for storing FTP usernames and passwords. These days we have clients of other protocols that use said file. Perhaps your IMAP or SMTP client uses it. So you put your e-mail account's password into ~/.netrc, and then meticulously configured the clients to always connect via TLS and to verify server certificates. You feel secure.
But you shouldn't. Here's how an attacker capable of MiTM can exploit wget to steal ~/.netrc passwords:
1) Alice tries to download a file over HTTP:
$ wget http://xkcd.com/538/
2) Eve takes over the HTTP connection, sending a redirection response:
HTTP/1.1 303 See Other
Location: http://supersecuremail.example.net/
3) Alice's wget follows the redirection.
4) Eve takes over the connection to supersecuremail.example.net, requesting password authentication:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="moo"
5) Alice's wget sends the supersecuremail.example.net password straight to Eve.
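To see which hosts are exposed this way, something like the following sketch may help. It uses Python's standard netrc module to list the machine entries that carry a password; the function name hosts_with_passwords is invented for this example, and the path has to be passed in explicitly.

```python
import netrc

def hosts_with_passwords(path):
    """Return the sorted machine names in a netrc file that store a password."""
    parsed = netrc.netrc(path)
    # parsed.hosts maps each machine name to (login, account, password).
    return sorted(
        host
        for host, (login, account, password) in parsed.hosts.items()
        if password
    )
```

Calling it with the expanded path of ~/.netrc (for example via os.path.expanduser) gives the list of hosts whose passwords could leak through the redirect trick above.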
Setting internationalization environment variables is a bit tricky. For example, this:
$ LANG=sv_SE.UTF-8 stat /nonexistent
may look like a way to make stat(1) print the error message in Swedish. Yet there are many ways it could go wrong:
To make these things a little less intricate, I wrote localehelper. It's a bit like env(1), but it takes care of:
This does the right thing:
$ localehelper LANG=sv_SE.UTF-8 stat /nonexistent
stat: kan inte ta status på ”/nonexistent”: Filen eller katalogen finns inte
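One way a bare LANG= assignment can fail is variable precedence: under POSIX, LC_ALL overrides LC_MESSAGES, which in turn overrides LANG. The sketch below models only that rule. It ignores everything else (GNU's LANGUAGE variable, locales that are not generated on the system), and the function name is made up for the illustration.

```python
def effective_messages_locale(env):
    """Return the locale name POSIX precedence selects for LC_MESSAGES.

    env is a mapping such as os.environ. Unset and empty variables are
    both treated as "not set".
    """
    for name in ("LC_ALL", "LC_MESSAGES", "LANG"):
        value = env.get(name)
        if value:
            return value
    # The fallback is implementation-defined; "C" is assumed here.
    return "C"

# LANG alone works, but an inherited LC_ALL silently wins over it.
print(effective_messages_locale({"LANG": "sv_SE.UTF-8"}))
print(effective_messages_locale({"LANG": "sv_SE.UTF-8", "LC_ALL": "C"}))
```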
I've just released the initial version of gettext-inspector, a tool for checking gettext PO/POT/MO files. While it's in an early stage of development, it's already able to detect a wide range of problems. For example, this is what it emits on my system:
$ gettext-inspector /usr/share/locale/*/LC_MESSAGES/*.mo | cut -d ' ' -f 1,3 | sort | uniq -c | sort -rn
   1902 P: no-language-header-field
   1601 P: no-version-in-project-id-version
   1372 W: no-report-msgid-bugs-to-header-field
    273 P: invalid-content-transfer-encoding
    201 W: invalid-date
     78 W: syntax-error-in-plural-forms
     77 I: no-package-name-in-project-id-version
     63 W: boilerplate-in-report-msgid-bugs-to
     50 W: language-disparity
     47 I: unknown-header-field
     38 W: invalid-language
     25 W: boilerplate-in-project-id-version
     10 I: unknown-poedit-language
      8 W: no-date-header-field
      5 W: no-project-id-version-header-field
      5 W: c1-control-characters
      3 P: no-mime-version-header-field
      3 P: no-content-transfer-encoding-header-field
      2 W: non-portable-encoding
      1 W: invalid-report-msgid-bugs-to
      1 W: ancient-date
      1 I: unable-to-determine-language
I tried many spell-checkers for irssi, and they all sucked (and some of them were also completely insane). So I took the one that seemed least crazy and forked it.
I present you adequate, a tool that checks for some common bugs in packages you have installed on your system. The Debian package comes with APT hooks, so that you'll be notified (via debconf) every time you install an adequately-buggy package.
If you don't like broken symlinks, disappearing /usr/share/doc/*/ directories, obsolete conffiles, or *.py files without corresponding *.pyc, adequate is for you.
$ zcat Sources.gz | grep -o 'debhelper (>= 9.0[^)]*)' | sort | uniq -c | sort -rn
     47 debhelper (>= 9.0.0)
     13 debhelper (>= 9.0)
      4 debhelper (>= 9.0.0~)
      3 debhelper (>= 9.0~)
Hey, wake up! debhelper is no longer using such a versioning scheme. A simple (>= 9) would be both shorter and less silly.
Here's why.