Our team develops advanced methodologies tailored specifically to multilingual environments. We focus on pre-training multilingual models, enhancing the quality of multilingual instruction-tuning datasets, refining multilingual evaluation processes, boosting knowledge transfer across languages, and optimizing multilingual tokenization, among other initiatives.
The Technology & Society organization connects research, people, and ideas across Google and Alphabet to help shape and advance our most ambitious technology innovations and initiatives, and to responsibly improve their impact on users and society. We also aim to share perspectives, engage, and collaborate with others externally on technology-related issues and opportunities for society.
Responsibilities
Author research papers to share research results and extend their impact across the team and in the research community.
Research and develop technologies for improving multilingual Large Language Models (LLMs), such as instruction tuning, pre-training, and multilingual reasoning.
Research and develop technology for pre-training LLMs for target languages other than English.
Collaborate with other research teams to expand multilingual LLM technology.
Collaborate with Google first-party partner teams to deliver new multilingual technologies to production.
Minimum qualifications:
PhD in Computer Science, a related field, or equivalent practical experience.
Coding experience in Python, JavaScript, R, Java, or C++.
One or more scientific publications submitted to conferences, journals, or public repositories.
Preferred qualifications:
2 years of coding experience in Python, JavaScript, R, Java, or C++.
1 year of experience owning and initiating research agendas.
Experience with modern LLMs and generative models.
Experience with multilingual LLMs.
Recent publication track record in related generative AI (GenAI) fields.