cross pond high tech
light views on high tech in both Europe and US
Scooped by Philippe J DEWOST

OpenAI has published the text-generating AI it said was too dangerous to share


The research lab OpenAI has released the full version of a text-generating AI system that experts warned could be used for malicious purposes.

The institute originally announced the system, GPT-2, in February this year, but withheld the full version of the program out of fear it would be used to spread fake news, spam, and disinformation. Since then it’s released smaller, less complex versions of GPT-2 and studied their reception. Others also replicated the work. In a blog post this week, OpenAI now says it’s seen “no strong evidence of misuse” and has released the model in full.

 

GPT-2 is part of a new breed of text-generation systems that have impressed experts with their ability to generate coherent text from minimal prompts. The system was trained on eight million text documents scraped from the web and responds to text snippets supplied by users. Feed it a fake headline, for example, and it will write a news story; give it the first line of a poem and it’ll supply a whole verse.

It’s tricky to convey exactly how good GPT-2’s output is, but the model frequently produces eerily cogent writing that can often give the appearance of intelligence (though that’s not to say what GPT-2 is doing involves anything we’d recognize as cognition). Play around with the system long enough, though, and its limitations become clear. It particularly suffers with the challenge of long-term coherence; for example, using the names and attributes of characters consistently in a story, or sticking to a single subject in a news article.

The best way to get a feel for GPT-2’s abilities is to try it out yourself. You can access a web version at TalkToTransformer.com and enter your own prompts. (A “transformer” is a component of machine learning architecture used to create GPT-2 and its fellows.)
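Under the hood, the generation loop is simple even if the model is not: repeatedly predict a likely next token from the text so far and append it. As a toy illustration only — GPT-2 is a large transformer over subword tokens, nothing like a bigram counter — here is a minimal Python sketch of that prompt-continuation loop, with an invented miniature corpus:

```python
import collections

def train_bigrams(corpus):
    """Count word-to-next-word transitions in a toy corpus."""
    counts = collections.defaultdict(collections.Counter)
    words = corpus.split()
    for word, nxt in zip(words, words[1:]):
        counts[word][nxt] += 1
    return counts

def continue_prompt(counts, prompt, max_words=5):
    """Greedily extend a prompt with the most frequent next word."""
    out = prompt.split()
    for _ in range(max_words):
        options = counts.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

corpus = "the cat sat on the mat and the cat slept on the mat"
model = train_bigrams(corpus)
# Greedy continuation quickly falls into a loop ("the cat sat on the cat sat"),
# a crude echo of the long-term coherence problems described above.
print(continue_prompt(model, "the cat"))
```

Real systems sample from a full probability distribution over tens of thousands of subword tokens rather than taking a greedy argmax over bigrams, but the generate-append-repeat structure is the same.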

Philippe J DEWOST's insight:

The following article could have been written by GPT-2, the system OpenAI has just published. Or maybe not.
And while we're at it, has this comment here really been generated by me?

Scooped by Philippe J DEWOST

Babel phish: In which languages are internet passwords easiest to crack?


Despite entreaties not to, many people choose rather predictable passwords to protect themselves online. "12345", "password", and the like are easy to remember but also easy for attackers to guess, especially with programs that automate the process using lists ("dictionaries") of common choices. Joseph Bonneau, a computer scientist at Cambridge University, has recently published an analysis of the passwords chosen by almost 70m (anonymised) Yahoo! users. One interesting result is shown below: the chart shows what percentage of accounts could be cracked after 1,000 attempts using such a dictionary. Amateur linguists can have fun speculating on why the Chinese do so well and the Indonesians do not. But one particularly interesting twist is how little difference using language-specific dictionaries makes. It is possible to crack roughly 4% of Chinese accounts using a Chinese dictionary; using a generic dictionary containing the most common terms from many languages, that figure drops only slightly, to 2.9%. Speakers of every language, it seems, have fairly similar preferences.
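The 1,000-attempt dictionary attack measured above is straightforward to express in code. A minimal sketch, assuming a frequency-ordered guess list and a plaintext password sample — both datasets below are invented for illustration; real studies such as Bonneau's work on large, anonymised account data:

```python
def crackable_fraction(passwords, dictionary, budget=1000):
    """Fraction of accounts whose password falls within the first
    `budget` guesses of a frequency-ordered dictionary."""
    guesses = set(dictionary[:budget])
    return sum(1 for p in passwords if p in guesses) / len(passwords)

# Hypothetical data, purely for illustration.
common_guesses = ["123456", "password", "12345", "qwerty"]
sample_accounts = ["password", "S3cure!x9", "12345", "tr0ub4dor&3"]

print(crackable_fraction(sample_accounts, common_guesses))  # 0.5
```

Comparing `crackable_fraction` computed with a language-specific dictionary against the same budget from a generic one reproduces the article's comparison: if the two fractions are close, language-specific guessing buys the attacker little.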