Hackathon Winners Used Image Transformers and Clustering Algorithms to Achieve Impressive Results

This year’s ASME Student Hackathon challenged participants with a FactoryNet problem: cleaning up human-annotated image labels. The winning solution was tested with machine learning and presented surprising results.

The ASME Computers and Information in Engineering (CIE) Division held its 2024 Hackathon in August. Hosted at the International Design Engineering Technical Conferences & Computers and Information in Engineering Conference in Washington, D.C., the on-site and virtual preconference event gave students a rare opportunity to learn how data science and machine learning techniques are used to solve real-world engineering problems.

Engineering student participants competed for cash prizes while tackling complex challenges. After a week-long sprint, first place went to the team of Mutahar Safdar and William Jabbour from McGill University, who used image transformers and clustering algorithms to group images together based on visual similarity.

First-place presentation by Mutahar Safdar and William Jabbour, showing a two-dimensional plot of visual similarity produced with a data-efficient image transformer, along with several dataset images that this process flagged as outliers.
Seven teams submitted answers for the FactoryNet dataset problem, a challenge presented by UES Inc., a BlueHalo company, and the Air Force Research Laboratory (AFRL). The problem tasked participants with contributing to the early stages of a growing labeled image dataset, FactoryNet. Specifically, competitors cleaned up the human-annotated labels into meaningful classes and tested whether those classes could be used to create machine learning models.

The task of data sanitization is not a new one in data science, but modern transformer models and large language models (LLMs) have the potential to change and accelerate the process. Though the students’ solutions varied in method, the submissions showcased excellent strategies for leveraging cutting-edge tools to gain an edge on such an open-ended problem.

The majority of teams used vision-language models (VLMs) and other multimodal transformer models to interact with the image and label data. Interestingly, the application details of each use of transformer models varied. The winners used image transformers and clustering algorithms to group images together based on visual similarity. This analysis surfaced groups of images that were visually dissimilar from the rest of the dataset because they did not belong to the intended manufacturing domain.
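The winning pipeline’s exact code is not public; the following is a minimal sketch of the general approach, assuming image embeddings have already been extracted with an image transformer (random vectors stand in for real features here), then reduced to two dimensions and clustered so that sparse points surface as outlier candidates:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Placeholder for transformer embeddings: in practice each row would be
# the feature vector a model such as DeiT produces for one dataset image.
embeddings = rng.normal(size=(200, 384))

# Reduce to two dimensions for a visual-similarity plot.
coords = PCA(n_components=2).fit_transform(embeddings)

# Cluster in the reduced space; DBSCAN labels sparse points as -1,
# which is one way to surface candidate off-topic images for review.
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(coords)
outliers = np.flatnonzero(labels == -1)
print(f"{len(outliers)} candidate off-topic images flagged for review")
```

The `eps` and `min_samples` values are illustrative; tuning them against real embeddings would determine how aggressively outliers are flagged.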

Erik Braham, digital manufacturing research scientist at UES, presents the second-place award for the FactoryNet hackathon problem to Nazanin Mahjourian of Michigan Tech.
FactoryNet images were gathered from Wikimedia categories of manufacturing-relevant public domain images. Human labeling removed most off-topic images, but errors left some in the dataset. The power of combining a VLM with clustering algorithms to find these unique cases was impressive and, together with the team’s overall approach, contributed to their win.

Overall, the submissions illustrated the kind of solutions that could efficiently solve data sanitization problems at scale for the ever-expanding FactoryNet dataset. Among the other excellent efforts was the second-place entry of Nazanin Mahjourian of Michigan Technological University, who used OpenAI’s CLIP to measure how closely a label matched its image in the eyes of the VLM and subsequently created hierarchical classes by clustering similar labels. Other submissions also used OpenAI’s CLIP model to automatically label the images from their visual content rather than relying solely on the human labels.
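The core of a CLIP-style label check is a cosine-similarity score between an image embedding and each candidate label’s text embedding. The toy sketch below illustrates only that scoring step, with random vectors standing in for CLIP’s encoders and made-up labels ("lathe", "kitchen") used purely for illustration:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)

# Stand-ins for CLIP embeddings: in practice the image vector comes from
# CLIP's image encoder and each label vector from its text encoder.
image_emb = rng.normal(size=512)
label_embs = {
    "lathe": image_emb + 0.1 * rng.normal(size=512),  # nearly matches the image
    "kitchen": rng.normal(size=512),                  # unrelated label
}

# Labels scoring below a chosen threshold could be flagged for cleanup;
# here we simply rank the candidates.
scores = {label: cosine_similarity(image_emb, emb) for label, emb in label_embs.items()}
best = max(scores, key=scores.get)
```

On real data, the same per-label scores could also feed the label-clustering step used to build hierarchical classes.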

The use of transformer models in the majority of submissions showed the cutting-edge ingenuity in this community. The problem provided not just an opportunity for the students to show their skills but also a platform to present new and exciting approaches to the problems in dataset development. FactoryNet continues to be refined and developed, and anyone curious to interact with the hackathon data can access it on Zenodo.

Internally sponsored by AFRL, FactoryNet serves as a public open image dataset focusing on the manufacturing environment, accelerating digital manufacturing initiatives across the broader manufacturing community. FactoryNet’s primary motivation is to support the recent proliferation of vision algorithms within emerging technologies across manufacturing, including augmented reality and autonomous robot applications. The CIE Hackathon represents the first public announcement of FactoryNet. Stay tuned for updates as FactoryNet continues to mature.