2024 Week 11

It’s August 22 today, so not technically Week 11 (whoops)… But I was traveling/moving around from Texas to Bangkok for ACL and didn’t get a chance to put my final blog post up.

Read More

2024 Week 9

Week 9! On Thursday, I had a really awesome buffet lunch with my PI at The Carillon. The food was delicious (I ate brisket, creamy potato soup, many desserts, etc.) :P and I really enjoyed talking to my advisor about her family, travels, research journey and advice, drama about American politics (too crazy, and a political AI fashion show video), and future project directions.

Read More

2024 Week 8

Week 8! I moved to Austin this past Saturday, and it’s crazy to think that it’s already been a week here! It’s been really awesome and I love the UT community, convenience, apartment, and area. 🌞🤠

Read More

2024 Week 7

Week 7! As the heat is approaching, summer is going by so fast unfortunately… mid July is approaching 🌞

Read More

2024 Week 5

Week 5! The actual final days of June are coming. Summer is flying by, and the time crunch towards September is starting to feel more intense.

Read More

2024 Week 4

Week 4! My goodness, I’m surprised it’s nearly the end of June–yet so many questions remain unanswered about the direction of our research and where our paper should go. This week has mostly been brainstorming new directions and studying related literature to see where this research could take us. After doing a very small annotation sample, and with the professors being away for NAACL, we still need to finalize our annotation guidelines and get started on annotation ASAP if we want to submit to ICLR 2025 and have time for analysis, writing, and follow-up experiments.

Read More

2024 Week 3

Third week! As an update, I reformatted the data into a compatible JSON to view with Sebastian’s viewer. :) I reached out to Somin, who was very helpful and suggested using a more SOTA embedding model, so we decided to continue with my Alibaba model approach. I am happy I don’t have to embed a dataset of 800,000 abstracts again. From the annotation and the abstracts seen so far, it seems that, compared to the DPR RoBERTa baseline, the Alibaba retriever is retrieving much more relevant abstracts that help with answering and checking the claim, which is good.

Read More

2024 Week 2

Second week! As an update, I finished the embeddings of the entire dataset (800,000 different medical abstracts) from the trialstreamer dump, after long nights of running experiments with the A100 and using PyTorch tricks like making the batch size smaller :)

Read More

2024 Week 1

Hi everyone! My first blog post of 2024! I’m really excited to continue DREU this summer with Professor Jessy Li. I’m very happy to be co-leading with my friend and previous collaborator, Sebastian Joseph, as well as being advised by Professor Byron Wallace.

Read More

2023 Week 10

My final week of the official DREU program! Overall this summer I learned a lot more about LLMs, medical simplification, annotations, factuality, and natural language processing research in general. Furthermore, I got to collaborate with very supportive, encouraging, and helpful peers and mentors. I will be moving back from Austin soon and really enjoyed my month researching in Texas!

Read More

2023 Week 9

Halfway through my time in Austin, I’m actually starting to get used to the blistering heat :)

Read More

2023 Week 8

I really enjoyed meeting Professor Li, grabbing coffee, and discussing the next project steps with Byron virtually in the office!

Read More

2023 Week 7

Moved to Austin! My first time getting scorched in 103+ degree weather throughout the day…

Read More

2023 Week 5

I’ve tried FlanALPACA, GPT-3.5, ALPACA, FLAN-T5-XL, REDPAJAMA, FALCON, and 4-5 different prompts for each, as well as different parameter size variants for each model across the first ~200 abstracts in one of Byron’s trialstreamer datasets.

Read More

2023 Week 3

Out of extreme frustration after hours spent debugging conda environments and Python compatibility, and un-installing and re-installing packages, I decided to purchase Google Colab Pro+. I am so happy not to worry about compute problems and to be able to run many different LLM pipelines in different sessions at once :)

Read More

2023 Week 2

My second week of digging into LLM-related research… I got the abstract data from Byron’s trialstreamer source, but am running into some extremely irritating compute limitations. To run these extremely large language models, of up to 10 billion parameters, I need access to high-RAM GPUs. Currently I have access to two sources:

  1. MIT Satori
  2. Google Colab Free
Read More

2023 Week 1

Hi everyone! My first blog post! I’m really excited to start my DREU experience with Professor Li. I just had a first virtual group meeting and met Sebastian, a fellow student researcher at UT Austin. I also met Professor Byron Wallace.

Read More