Summer 2023 and 2024 DREU Project

2024 Week 11
It’s August 22 today, so not technically Week 11 (whoops)… But I was traveling/moving from Texas to Bangkok for ACL and didn’t get a chance to put my final blog post up.
My goodness, week 10! I can’t believe that it’s already August! :)
Week 9! On Thursday, I had a really awesome buffet lunch with my PI at The Carillon. The food was very delicious (I ate brisket, creamy potato soup, many desserts, etc.) :P and I really enjoyed talking to my advisor about her family, travels, research journey and advice, the drama of American politics (too crazy, plus a political AI fashion show video), and future project directions.
Week 8! I moved to Austin this past Saturday; it’s crazy to think it’s already been a week here! It’s been really awesome and I love the UT community, the convenience, my apartment, and the area. 🌞🤠
Week 7! With the heat bearing down, summer is going by so fast, unfortunately… mid-July is almost here 🌞
Week 6! Happy July 4th to everyone! 🎆🎆
Week 5! The actual final days of June are coming. Summer is flying by, and the time crunch towards September is starting to feel more intense.
Week 4! My goodness, I’m surprised it’s nearly the end of June, yet so many questions remain unanswered about the direction of our research and where our paper is headed. This week has mostly been about brainstorming new directions and studying the related literature for where this research could take us. We’ve done a very small annotation sample, but with the professors away for NAACL, we still need to finalize our annotation guidelines and get started on annotation ASAP if we want to submit to ICLR 2025 and have time for analysis, writing, and follow-up experiments.
Third week! As an update, I reformatted the data into a compatible JSON format to view with Sebastian’s viewer. :) After reaching out and talking to Somin, who was very helpful, he suggested using a more SOTA embedding model, so we decided to continue with my Alibaba model approach. I am happy I don’t have to embed the dataset of 800,000 abstracts again. From the annotation and the abstracts seen so far, the Alibaba retriever is pulling up much more relevant abstracts than the DPR RoBERTa baseline, ones that are actually helpful for answering and checking the claim, which is good.
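To make the comparison a bit more concrete, here is a minimal sketch of claim-to-abstract retrieval with a dense embedding model, assuming sentence-transformers and an Alibaba GTE-style checkpoint; the model name, claim, and abstracts below are illustrative placeholders rather than our actual pipeline.

```python
# Sketch of dense retrieval: embed a claim and candidate abstracts,
# then rank abstracts by cosine similarity to the claim.
from sentence_transformers import SentenceTransformer, util

# Hypothetical model choice; the exact checkpoint we used may differ.
retriever = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

claim = "Beta-blockers reduce mortality after myocardial infarction."  # example claim
abstracts = [
    "In a randomized trial of post-MI patients, metoprolol reduced all-cause mortality...",
    "This study evaluates the effect of statins on LDL cholesterol levels...",
]

# Normalized embeddings make cosine similarity a simple dot product.
claim_emb = retriever.encode(claim, convert_to_tensor=True, normalize_embeddings=True)
abs_embs = retriever.encode(abstracts, convert_to_tensor=True, normalize_embeddings=True)
scores = util.cos_sim(claim_emb, abs_embs)[0]

# Print abstracts from most to least relevant to the claim.
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.3f}  {abstracts[idx][:60]}...")
```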
Second week! As an update, I finished embedding the entire dataset (800,000 different medical abstracts) from the Trialstreamer dump, after long nights of running experiments on the A100 and using PyTorch tricks like shrinking the batch size :)
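For anyone curious what that job roughly looks like, here is a minimal sketch of embedding a large corpus in chunks with a small batch size; the file names, chunk size, batch size, and model are assumptions for illustration, not my exact setup.

```python
# Batched embedding of a large abstract corpus on a single GPU.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5",
                            trust_remote_code=True, device="cuda")

# abstracts.txt: one abstract per line (hypothetical dump of ~800k abstracts).
with open("abstracts.txt") as f:
    abstracts = [line.strip() for line in f]

chunks = []
batch_size = 16  # small batches keep the GPU from running out of memory
for start in range(0, len(abstracts), 10_000):
    chunk = abstracts[start:start + 10_000]
    embs = model.encode(chunk, batch_size=batch_size,
                        normalize_embeddings=True, show_progress_bar=True)
    chunks.append(embs)  # saving intermediate chunks to disk also helps if a run dies

np.save("abstract_embeddings.npy", np.concatenate(chunks))
```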
Hi everyone! My first blog post of 2024! I’m really excited to continue DREU this summer with Professor Jessy Li. I’m very happy to be co-leading with my friend and previous collaborator, Sebastian Joseph, as well as being advised by Professor Byron Wallace.
My final week of the official DREU program! Overall this summer I learned a lot more about LLMs, medical simplification, annotations, factuality, and natural language processing research in general. Furthermore, I got to collaborate with very supportive, encouraging, and helpful peers and mentors. I will be moving back from Austin soon and really enjoyed my month researching in Texas!
Halfway through my time at Austin, I’m actually starting to get used to the blistering heat :)
I really enjoyed meeting Professor Li, grabbing coffee, and discussing the next project steps in the office with Byron joining virtually!
Moved to Austin! My first time getting scorched in 103+ degree weather throughout the day…
Moving to Austin, Texas soon!
I’ve tried Flan-Alpaca, GPT-3.5, Alpaca, Flan-T5-XL, RedPajama, and Falcon, with 4-5 different prompts for each, as well as different parameter-size variants of each model, across the first ~200 abstracts in one of Byron’s Trialstreamer datasets.
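Roughly, the sweep is just a loop over models and prompt templates applied to the same slice of abstracts. Below is a hedged sketch using Hugging Face pipelines; the model list, prompt wording, and file names are placeholders (and API models like GPT-3.5 would go through a separate client, not this loop).

```python
# Sketch of a model x prompt sweep over the first ~200 abstracts.
import json
from transformers import pipeline

models = ["google/flan-t5-xl", "declare-lab/flan-alpaca-xl"]  # illustrative subset
prompts = [
    "Simplify this medical abstract for a layperson:\n{abstract}",
    "Summarize the main findings of this trial in plain English:\n{abstract}",
]

with open("trialstreamer_abstracts.json") as f:  # hypothetical file name
    abstracts = json.load(f)[:200]

results = []
for model_name in models:
    generator = pipeline("text2text-generation", model=model_name, device=0)
    for template in prompts:
        for abstract in abstracts:
            out = generator(template.format(abstract=abstract), max_new_tokens=256)
            results.append({"model": model_name, "prompt": template,
                            "output": out[0]["generated_text"]})

with open("generations.json", "w") as f:
    json.dump(results, f, indent=2)
```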
What babysitting LLM generations in the middle of the night looks like:
Out of extreme frustration and hours spent debugging conda environments, Python compatibility, and uninstalling and reinstalling packages, I decided to purchase Google Colab Pro+. I am so happy not to have to worry about compute problems and to be able to run many different LLM pipelines in separate sessions at once :)
My second week of digging into LLM-related research… I got the abstract data from Byron’s Trialstreamer source, but am running into some extremely irritating compute limitations. To run these extremely large language models, of up to 10 billion parameters, I need access to high-RAM GPUs. Currently I have access to two sources:
Hi everyone! My first blog post! I’m really excited to start my DREU experience with Professor Li. I just had a first virtual group meeting and met Sebastian, a fellow student researcher at UT Austin. I also met Professor Byron Wallace.