Agenda

Monday, Feb 12, 2024

Time	Activity
10am-12pm	Julia Basics Speaker: Nick Ulle (UC Davis), Derek Devnich (UC Merced), Carl Stahmer (UC Davis), Ezra Morrison (UC Davis) This workshop is an introduction to the Julia programming language for people familiar with R, Python, or MATLAB. Compared to those languages, Julia code typically runs orders of magnitude faster but has a similar level of abstraction, so you can focus on your research problem rather than hardware minutiae. Julia also provides out-of-the-box Unicode support, an easy-to-use package manager, multithreading facilities, a macro system, and a rich type system to optimize and prevent bugs in your code. Workshop topics include a concise overview of Julia's syntax and features, an end-to-end introduction to using built-in functions and contributed packages to read, summarize, and visualize tabular data, and real-world examples where we've found Julia beneficial. UCLA is hosting a watch-party at YRL: https://calendar.library.ucla.edu/event/11985555
12-1pm	Thinking about and finding health statistics & data Speaker: Michael Sholinbeck (UC Berkeley) Where can I find data on [fill in the blank]? Let your fingers take a dive into the world of health statistics and data! In this workshop, you will learn about some of the issues surrounding the collection of health statistics, and will also learn about authoritative sources of health statistics and data. We will explore tools that let you create custom tables of vital statistics (birth, death, etc.), disease statistics, health behavior statistics, and more. This workshop has good measures of both CONTEXT and CONTENT.
1-2pm	GIS and Mapping: where to start Speaker: Susan Powell (UC Berkeley) Interested in digital mapping and GIS (geographic information science), but not sure where to start? Have some experience, but want to learn more about what the UCs have to offer? This virtual workshop is for you! I'll provide an overview of the GIS and digital mapping landscape as a whole, including: which tools are out there and how to choose the right one for your needs, common terms used in the field, resources for learning how to get started mapping, and where to go to find data to create your first project. No experience or special software is required to participate in this workshop.
2-3pm	To Scan or Not to Scan: Construct 3D objects with Photogrammetry Speaker: Alvaro Alvarez (UC Riverside), Brendon Wheeler (UC Riverside), Kat Koziar (UC Riverside) In this hands-on hybrid workshop, learners will explore the relationships within data clouds and images to build and render 3D images. What does that really mean? That we’ll take lots of photos of an object to create 3d image data. We will step through capturing images using openly available software with a smart phone or tablet. We will also discuss what contributes to successful 3D scanning, and touch on different techniques currently used. Choose your favorite object (about 1 foot or 12 inches in height) to practice with during this hands-on workshop.
3-4pm	GitHub for Everyone Speaker: Julien Brun (UC Santa Barbara), Renata Curty (UC Santa Barbara), Greg Janee (UC Santa Barbara) In this workshop, we will explore the features of the GitHub website to manage and version content as a team. We will go over how to create a repository, edit files, and leverage the markdown syntax for documenting and tracking your work. We will also demonstrate how to invite collaborators. Participating in this workshop requires no coding skills or pre-installs (including the hands-on session). Everything will be done from your web browser!

Tuesday, Feb 13, 2024

Time	Activity
10am-12pm	Developing Your Data Science Portfolio Speaker: Michele Tobias (UC Davis), Pamela Reynolds (UC Davis) How can you demonstrate your technical skills to a potential employer, supervisor, funder, or collaborator? During this workshop we will discuss digital (online) portfolios, which complement your CV/resume by showcasing your coding, data visualization, data analysis, mapmaking, and other skilled technical abilities. We will discuss various purposes for digital portfolios, the components of a digital portfolio, methods and tools for creating one, and considerations for carefully curating and presenting your work in an engaging manner. This workshop is being offered virtually in conjunction with [UC Love Data Week](https://uc-love-data-week.github.io/).
12pm-1pm	Tips and Tools for Dealing with Large Datasets Speaker: Leigh Phan (UCLA), Kristian Allen (UCLA) Do you lie awake at night waiting for your R script to finish? Is your laptop running out of memory due to merging million row data frames? Do you panic at the thought of opening your data file in Excel? This presentation is for you! We’ll cover some low hanging fruit and basic approaches to squeeze the most out of your local machine when dealing with larger data sets. We will walk through some efficient command line tools, R libraries, and file formats to get the most out of your laptop.
1-2pm	Cultivating Collaboration: Getting Started with Open Research Speaker: Wasila Dahdul (UC Irvine), Ariel Deardorff (UC San Francisco), Reid Otsuji (UC San Diego), Sam Teplitzky (UC Berkeley) Open research or open science is the practice of making the process and products of research transparent and accessible to others for collaboration, reproducibility, and broader reuse. This workshop will provide an overview of open principles and practices that foster collaboration and make research outputs, such as data, methods, and publications, openly available. By the end of this workshop you will be able to: - Define open research and describe the benefits of putting it into practice - Adopt transparent and collaborative practices in your own research - Explore approaches to collaborative team science that promote openness - Locate key resources for open science and collaboration available at the University of California
2-3pm	Feeding Curiosity: A smorgasbord of open data sources for research Speaker: Stephanie Labou (UC San Diego), Kat Koziar (UC Riverside), Geoff Boushey (UC San Francisco), Christy Navarro (UC Davis Health Clinical and Translational Science Center), Andy Lyons (UC ANR), Ariel Deardorff (UC San Francisco) Join presenters across the UC system for a lively discussion on finding and accessing open data for research. We’ll set the table with an overview of open data, then have lightning talks about data resources we love, including: - curated collection of health data resources - international survey and census data - public datasets from Google BigQuery - searchable repository of municipal-level data - and a stockpile of unique and curious datasets from unexpected sources!
3-4pm	Code-free Data Analysis Speaker: Ann Glusker (UC Berkeley) Python and Stata and R, oh my! Is the prospect of having to learn to code getting you down? Do you ever feel like you'd like to just get into the data directly but aren't sure how? In this workshop we'll explore the many ways you can perform data analysis without needing to know statistical packages or languages first. We'll look at tools which let you upload your own data, tools for data visualization, and administrative data sites and tools which allow for interactive data requests and table creation, among other possibilities. No prior data analysis experience necessary.

Wednesday, Feb 14, 2024

Time	Activity
9-11am	All of Us: A Comprehensive Precision Medicine Data Hub for UC Researchers Speaker: Yury E. Garcia (UC Davis), Pamela Reynolds (UC Davis) The All of Us (AoU) Research Program is a National Institutes of Health (NIH) program that aims to enroll one million diverse participants to address historical gaps in medical research. The goal of the program is better health for all of us. Its mission is to accelerate health research and medical breakthroughs, enabling individualized prevention, treatment, and care for all. By leveraging the power of cloud computing, the program facilitates extensive collaborations and robust analyses. AoU compiles a comprehensive dataset, available through the All of Us Researcher Workbench, which includes survey responses, physical measurements, bio samples, Electronic Health Records (EHRs), and genomic data. As of November 2023, All of Us boasts over 741,000 enrolled participants, with 508,000 completing initial program steps, over 409,000 electronic medical records, and 525,000 bio samples. Notably, 75% of participants with available data on the All of Us Researcher Workbench identify with underrepresented communities, and 45% belong to racial and ethnic minority groups. The Researcher Workbench provides a user-friendly, accessible, and secure environment for conducting analyses, ensuring that the wealth of data collected is readily available for scientific exploration. This dataset is a significant step toward fostering innovation and cultivating a comprehensive understanding of health determinants in diverse populations. By encompassing this broad spectrum of health-related data, the All of Us dataset opens doors for a myriad of analyses, including Exposure Assessment, Health Disparities, Environmental Justice, Genetic-Environmental Interactions, and Behavioral Responses to Environmental Factors.
Session Content: In the first hour, we will deliver an informative presentation covering the purpose of the All of Us program, the available participant data and resources, and key benefits for researchers. Additionally, we will showcase various research examples conducted using this extensive database. In the second hour, we will focus on guiding participants through enrollment, including where to access the database. We will elaborate on the different tools and resources available which provide insights into the workbench. This will include a detailed presentation on creating cohorts and datasets and instructions on exporting data for further analysis. This workshop is particularly relevant for researchers and students engaged in various fields, including public health, epidemiology, medicine, genetics, psychology, and any area where data from Fitbit, medical records, genetics, and social health determinants play a crucial role.
10am-11:30am	Trust Trek: Navigating the AI Trust Maturity Maze Speakers: Reema Hamasha (University of Michigan), Ariella Hoffman-Peterson (University of Michigan), Christy Navarro (UC Davis Health Clinical and Translational Science Center) Embark on a transformative journey by participating in a resource demonstration at UC Love Data Week 2024 tailored for individuals engaged in data governance or health research involving algorithmic applications. Delve into the intricate world of Artificial Intelligence (AI) within the healthcare domain, where subject matter experts from various Clinical and Translational Science Award programs will guide you through the nuances of managing AI complexity. Our workshop places a special emphasis on establishing trust in AI applications and offers a unique maturity model designed specifically for health AI projects. Join us to explore best practices in evaluating AI within the context of technology, processes, and people. Uncover our systematic approach to navigating through diverse frameworks, as we elucidate the key elements and sub-elements. This model, carefully crafted with research teams and data governance structures in mind, provides a comprehensive framework for ensuring the ethical and effective implementation of AI in health-related endeavors. Embrace this opportunity to gain valuable insights and enhance your proficiency in the evolving intersection of healthcare and artificial intelligence.
12-1pm	Dependency Management 101 Speaker: Renata Curty (UC Santa Barbara), Julien Brun (UC Santa Barbara), Greg Janee (UC Santa Barbara) Have you ever felt trapped in the maze of packages and library dependencies for your R or Python projects? Join us for a webinar to unravel the 'Dependency Hell' problem and provide practical strategies to tackle this issue while making your research more reproducible. This webinar is part of the 2024 UC Love Data Week Program.
1-2pm	Drone Data 101: The Data Pipeline Speaker: Sean Hogan (UC ANR), Andy Lyons (UC ANR) Drones are well-known for their breathtaking photos and video, but the real magic for researchers occurs when a series of images are stitched together with modern photogrammetry software, generating extremely high resolution, georectified orthomosaic images and 3D surface models. This workshop will cover the full data pipeline to utilize this remarkable technology, from flight planning, to data analysis, to visualization of the final results, with examples from plant science and ecology.
1-2:30pm	Wikipedia Edit-a-thon Speaker: Ann Glusker (UC Berkeley), Corliss Lee (UC Berkeley), Sarah Rosenkrantz (UC Berkeley), Misha Coleman (UC Berkeley), Marri Atienza (UC Berkeley) Wikipedia is among the most visited websites in the world; it is routinely cited in scholarly articles, the news, and congressional hearings. But Wikipedia openly admits its content is skewed by the gender and racial imbalance of its editors. You can be part of the solution! Join us on Zoom for a tutorial for the beginner Wikipedian, then edit some Wikipedia entries yourself (you may want to sign up for a free Wikipedia account in advance). You may bring topic ideas or articles you’d like to edit, or we’ll help you find a way to make a difference in an area of interest to you!
2pm-3pm	3D Data and Images Speaker: Alvaro Alvarez (UC Riverside), Brendon Wheeler (UC Riverside), Kat Koziar (UC Riverside) From virtual reality and animation to cultural research and preservation, thanks to advancing technologies, 3D data and imaging has expanded in accessibility and application integration. In this workshop, we will cover the basics of 3D data and imaging: different types, and where to find and share files. We will list openly available software used to modify 3D image data files, and describe situations to use which software and hardware.
3-4pm	Data Discovery and Evaluation in the Social Sciences: Present and Historical Speaker: David Michalski (UC Davis) This presentation on data discovery and evaluation examines today’s data landscape by providing an overview of data aggregators, repositories, and government information, as well as tips and tools that can be used to find and evaluate historical data sources. Participants will be introduced to a discovery process that will allow them to think creatively about how data is socially structured and how it can be repurposed for contemporary research.

Thursday, Feb 15, 2024

Time	Activity
10am-12pm	Overview of Remote and High Performance Computing (HPC) Speaker: Pamela Reynolds (UC Davis), Nick Ulle (UC Davis) Working on a research project but your data or models are too big and complex for your laptop? You need more compute power! During this session we'll discuss the differences and advantages of various remote and networked computing options, from servers in your lab to institutional HPC and cloud services. We'll cover an overview of HPC terminology, architecture, and general workflows. We'll also provide information about UC-specific computing resources and contacts. This workshop is a prerequisite introduction to DataLab's Introduction to Remote Computing series where you'll learn how to access and work efficiently on the UC Davis FARM HPC. By the end of this workshop attendees should be able to: - compare local, remote, HPC, and cloud computing options - define common compute jargon including cluster, supercomputer, nodes, - identify the skills needed for working on remote and HPC systems - describe how remote and HPC could benefit their research - identify where to go to learn more!
12-1pm	Tableau Workshop Speaker: Madison Bautista (UCLA) Learn how to utilize Tableau Prep Builder and Tableau to showcase your data. From data cleaning and hiding blank fields to data filtering, Tableau Prep can be used to organize your data more efficiently. Once the data is clean, Tableau can be used to visualize it as charts, graphs, and even GPS coordinates. This small workshop will show you how anyone can easily use Tableau Prep and Tableau for data organization and analysis. In addition to Zoom, there is an on-site attendance option at UCLA in Public Affairs Building Room 2035B.
3-4pm	Telling Your Data Story Speaker: Beth Tweedy (UC Davis) Have you ever wondered if you should include a particular figure in a presentation? Or what about the order in which you present your data? This workshop will discuss tips, tricks, and approaches to using your data as a tool to tell your research story. We will discuss how data can fit into a narrative structure, how to keep a good flow going while presenting data, and more. No previous experience or expertise is required to participate.

Friday, Feb 16, 2024

Time	Activity
10-11:30am	Data resources for Social Sciences & social media data ethics Speaker: Nashra Mahmood (UCLA), Tianji Jiang (UCLA) First Session: Sharing and reusing data can effectively reduce redundant efforts in data collection, ultimately saving valuable time and resources, especially for smaller laboratories. Additionally, it enhances the efficiency of scientific research investments by preventing the reinvention of the wheel. Building a sustainable data-reuse process and culture relies on frameworks that comprise policies, standards, roles, and responsibilities, all of which must address the diverse needs of data providers, curators, and (re)users alike. A critical step in the data sharing and (re)use cycle involves researchers identifying data that are (re)usable for their needs. It is a common challenge among researchers to locate data that appear (re)usable. In this session, I aim to introduce 2-3 data repositories ideal for researchers seeking (re)usable data for their studies. Additionally, I will offer several tips and strategies for identifying and utilizing (re)usable data effectively. The session is structured to last approximately 45 minutes, encompassing a 25-minute presentation, a 5-minute question-and-answer segment, and a 10-minute practical workshop exercise. Second Session: Social media is increasingly becoming an important site for data collection and research for students, researchers, and scholars across the Humanities and Social Sciences. This workshop introduces participants to the existing landscape of social media data and the fluctuating access academics have to ‘big datasets’. Using Facebook, Twitter (or X), and Reddit as examples of data archives, we discuss the latest guidelines and restrictions on scraping social media data from these three platforms and how they affect researchers conducting social media research. 
The workshop urges participants to rethink their notion of “Data” concerning social media to foreground the issue of research ethics that is often sidestepped or not taken seriously due to the presumed publicness of social media data.
1-2pm	Getting started with qualitative data analysis: open tools and methods Speaker: Ann Glusker (UC Berkeley) This workshop will give an overview of basic qualitative research methods, starting with (a little) theory, and moving on to sampling and theme development. We will then take a look at open tools for qualitative data analysis. After a brief demo of Taguette, other tools, such as QDA Miner, qcoder and QualCoder, will be reviewed. Additional resources will be mentioned and the slides/links made available.
