Data Engineer Position in Arnaout Laboratory at UCSF

We are hiring a data engineer with mid-career skills and experience. Applications will be accepted on a rolling basis.

ABOUT. The Arnaout laboratory studies deep learning and other computational methods for biomedical imaging and related clinical data, with the goals of decreasing diagnostic error and developing and scaling novel phenotypes to drive precision medicine. UCSF is a top-10 medical center and a leader in cross-campus efforts to mine, harmonize, and analyze multi-modal clinical data for the University of California’s 15 million patients. The Arnaout laboratory is part of both the Bakar Computational Health Sciences Institute and the nationally ranked Department of Medicine. Projects focus on deep learning for medical imaging, and through collaborative work with intra- and inter-institutional partners, also involve the electronic health record, genetics, and other data types.

POSITION. The data engineer position offers an opportunity to participate in cutting-edge research with transformational impact on clinical and research medicine across a wide array of diseases, working with decades of high-quality medical data alongside clinical domain experts. The position also provides opportunities to publish, present at research conferences, and advance professionally. Salary and benefits are set according to experience and UCSF salary scales.

CONTACT. Inquiries should be emailed directly to Dr. Arnaout (rima.arnaout (at) ucsf.edu) and should include a CV and a clear but brief letter of interest.


Requirements:
• a Master's, PhD, or equivalent degree and/or experience in computer science, data science/analytics, or a related field
• experience with local, hybrid, and cloud computing/HPC, including strategies for automation, scaling, and crowdsourcing of data mining, preprocessing, storage, labeling, and computing
• experience with containerization and container orchestration tools (e.g. Docker, Kubernetes)
• experience building, managing, and querying structured and unstructured databases
• strong interpersonal, organizational, record-keeping, written & oral communication skills
• ability to work with patient data in a HIPAA-compliant and ethically responsible manner
• ability to work independently to complete assigned responsibilities
• ability to work with an interdisciplinary team of both medical and data science professionals
• strong motivation to apply pioneering breakthroughs to practical, personalized patient care
• fluency with Python (ideally using Pandas, Keras, TensorFlow/Theano, etc.) and related programming languages, as well as data visualization (ideally using d3.js, R, MATLAB, pylab, seaborn, etc.)


In addition to the above, the successful applicant will:
• be proficient in creating pipelines for organized, reproducible, scalable, and secure data mining, cleaning/preprocessing, storage, labeling, and computing
• have experience writing/deploying web applications
• have experience in neural network design and optimization
• have a working knowledge of biology and human physiology, and/or a desire to learn relevant concepts
• have an interest in working in an academic research environment