As researchers contemplate mining the students’ details, however, the university is grappling with ethical issues raised by the collection and analysis of these huge data sets, known familiarly as Big Data, said L. Rafael Reif, the president of M.I.T.
For instance, he said, serious privacy breaches could hypothetically occur if someone were to correlate the personal forum postings of online students with institutional records that the university had de-identified for research purposes.
It wasn’t so long ago that the excitement surrounding online education reached fever pitch. Various researchers offering free online versions of their university classes found they could attract vast audiences of high-quality students from all over the world. The obvious next step was to offer far more of these online classes.
That started a rapid trend, and various organisations sprang up to offer online versions of university-level courses that anyone with an Internet connection could sign up for. The highest-profile of these are Coursera, Udacity, and edX.
But this new golden age of education has rapidly lost its lustre.
German Chancellor Angela Merkel is proposing building up a European communications network to help improve data protection.
It would avoid emails and other data automatically passing through the United States.
In her weekly podcast, she said she would raise the issue on Wednesday with French President Francois Hollande.
Revelations of mass surveillance by the US National Security Agency (NSA) have prompted huge concern in Europe.
Disclosures by the US whistleblower Edward Snowden suggested even the mobile phones of US allies, such as Mrs Merkel, had been monitored by American spies.
Our home computer console will be used to send and receive messages—like telegrams. We could check to see whether the local department store has the advertised sports shirt in stock in the desired color and size. We could ask when delivery would be guaranteed, if we ordered. The information would be up-to-the-minute and accurate. We could pay our bills and compute our taxes via the console. We would ask questions and receive answers from “information banks”—automated versions of today’s libraries. We would obtain up-to-the-minute listing of all television and radio programs … The computer could, itself, send a message to remind us of an impending anniversary and save us from the disastrous consequences of forgetfulness.
It took decades for cloud computing to fulfill Baran’s vision.
Today’s big data is noisy, unstructured, and dynamic rather than static. It may also be corrupted or incomplete. “We think of data as being comprised of vectors – a string of numbers and coordinates,” said Jesse Johnson, a mathematician at Oklahoma State University. But data from Twitter or Facebook, or the trial archives of the Old Bailey, look nothing like that, which means researchers need new mathematical tools in order to glean useful information from the data sets. “Either you need a more sophisticated way to translate it into vectors, or you need to come up with a more generalized way of analyzing it,” Johnson said.
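To make the idea of translating text into vectors concrete, here is a minimal sketch of one very simple approach, a bag-of-words count. It is only an illustration and not a method attributed to Johnson; the function names `build_vocabulary` and `to_vector` and the sample posts are hypothetical.

```python
from collections import Counter


def build_vocabulary(texts):
    """Return the sorted list of distinct lowercase words across all texts."""
    return sorted({word for text in texts for word in text.lower().split()})


def to_vector(text, vocab):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    # A Counter returns 0 for words that are absent, so every vector
    # has the same length as the vocabulary.
    return [counts[word] for word in vocab]


# Hypothetical sample posts, invented for illustration only.
posts = ["big data is noisy", "data is data"]
vocab = build_vocabulary(posts)
vectors = [to_vector(post, vocab) for post in posts]
```

Each post becomes a fixed-length list of counts, one position per vocabulary word. A representation this crude discards word order and context, which is one reason richer translations or more general analysis methods are sought.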
In their parents’ attic, in boxes in the garage, or stored on now-defunct floppy disks — these are just some of the inaccessible places in which scientists have admitted to keeping their old research data. Such practices mean that data are being lost to science at a rapid rate, a study has now found.
The authors of the study, which is published today in Current Biology [1], looked for the data behind 516 ecology papers published between 1991 and 2011. The researchers selected studies that involved measuring characteristics associated with the size and form of plants and animals, something that has been done in the same way for decades. By contacting the authors of the papers, they found that, whereas data for almost all studies published just two years ago were still accessible, the chance of the data remaining accessible fell by 17% per year. Availability dropped to as little as 20% for research from the early 1990s.