Joseph Hellerstein uses machine learning to search science data
Prof. Joseph Hellerstein is one of the principal investigators on a research team that is developing innovative machine learning tools to pull contextual information from scientific datasets and automatically generate metadata tags for each file. Scientists can then search these files through Science Search, a web-based search engine for scientific data that the Berkeley team is building. The work is being done in conjunction with the Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab), whose principal investigators on the project include Katie Antypas, Lavanya Ramakrishnan and Gunther Weber. “Our ultimate vision is to build the foundation that will eventually support a ‘Google’ for scientific data, where researchers can even search distributed datasets,” said Ramakrishnan. “Our current work provides the foundation needed to get to that ambitious vision.”