Library Artefact Digitisation Q&A
Deep dive into an AI project — featuring Pranav Kutty and Ryan Garg
Featuring Authors:
Pranav Kutty — Project Manager for Library Artefact Digitisation
Ryan Garg — Project Team Member for Library Artefact Digitisation
We had the opportunity to hear from Pranav Kutty, current project manager of the Library Artefact Digitisation project, and Ryan Garg, one of its team members. Pranav and Ryan share their insights from the project, which involves digitising real-world objects into viewable, interactive three-dimensional models! The project is a collaboration with Monash Automation, another student engineering team, who are building robots that can capture the videos and images of the objects to be digitised.
Can you tell me a little bit more about the project, as well as your side of the collaboration?
Pranav — At the beginning of this project, the idea was that we would work with the Sir Louis Matheson Library to digitise some of the artefacts in their collection, so that they could have digital versions available on their website for the public to view. Since then, our project has evolved into pursuing the idea that we can create a framework to digitise any object. For those unfamiliar with digitisation, as the name suggests, it is the process of converting items into a digital form. In our context, we aim to convert objects into a digital 3D image, allowing users to scroll around and view all angles in a digital format as if they had the object in front of them. A great example of this is the George Washington artefact digitisation from the Smithsonian Museum, linked below.
This involves taking images or videos of the object and then training models on this data to produce a final digitised output. This project is in collaboration with Monash Automation, who are currently working on a robot that can take videos and images of the objects we want to digitise. We have been exploring the various existing digitisation processes and comparing their outputs after training to determine what works best for us.
As team lead, my role in this project is primarily centred around planning, organisation, and communication, as well as contributing to model training and research.
What technologies and methodologies are being employed in the development of this AI project?
Both — This project thus far has centred around two technologies: Neural Radiance Fields (NeRFs) and Gaussian Splatting. Both of these are machine learning methods of digitisation, and both are installed and run predominantly using the command line. At its essence, a NeRF functions as a neural network generating a 3-dimensional scene representation from 2-dimensional input images. However, within the realm of NeRFs, numerous variants exist, and our current focus lies in exploring these diverse implementations. Some of the NeRF variants we are delving into include Instant-NGP, Mip-NeRF, and PyNeRF.
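For readers who would like something more concrete, here is a minimal, simplified sketch of the basic recipe a NeRF follows: a small neural network maps a 3D position to a colour and a density, and a pixel is rendered by compositing many such samples along a camera ray. It is written in PyTorch, and the names TinyNeRF and render_ray are purely illustrative; this is not the team's code or any particular variant.

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    """A toy NeRF-style network: 3D point in, RGB colour and density out."""
    def __init__(self, hidden: int = 256):
        super().__init__()
        # A full NeRF applies a positional encoding to xyz (and usually also
        # takes the viewing direction); both are omitted to keep the sketch short.
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 colour channels + 1 volume density
        )

    def forward(self, xyz: torch.Tensor):
        out = self.mlp(xyz)
        rgb = torch.sigmoid(out[..., :3])   # colour constrained to [0, 1]
        sigma = torch.relu(out[..., 3])     # non-negative density
        return rgb, sigma


def render_ray(model, origin, direction, near=2.0, far=6.0, n_samples=64):
    """Volume-render a single camera ray by compositing samples along it."""
    t = torch.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction       # (n_samples, 3) sample positions
    rgb, sigma = model(points)
    delta = (far - near) / n_samples
    alpha = 1.0 - torch.exp(-sigma * delta)        # opacity of each sample
    # Transmittance: how much light survives to reach each sample unoccluded.
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10]), dim=0)[:-1]
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(dim=0)     # final pixel colour

# Training then amounts to comparing rendered pixel colours against the real
# photographs and minimising the difference with gradient descent.
```

Variants such as Instant-NGP and Mip-NeRF layer refinements on top of this basic structure, for example learned hash-grid encodings and more careful ray sampling, which is part of what differs between the implementations being compared.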
Moving forward, we will be shifting our focus to the different models of Gaussian Splatting. The plan is to explore all available methods thoroughly, and once we have done so, we aim to develop our own model. By experimenting with and understanding the various approaches within NeRFs and Gaussian Splatting, we hope to contribute innovative solutions to the field of machine learning-based digitisation.
How do you envision this AI project making a positive impact or addressing real-world issues?
Ryan — Whilst at the moment we’re working on digitising arbitrary objects at Monash University, the technology of creating 3-dimensional scenes from sets of 2-dimensional images can have many real-world applications. For instance, there are studies being conducted on the effectiveness of rendering medical imagery such as MRI and ultrasound scans. There are also many applications in aerial imagery, and NeRFs can be used in fields such as urban planning.
Pranav — Given that we are attempting to produce an output for the Matheson Library, and that they intend our work to be used on their website, it is fair to say that this project has a tangible real-world impact. I think the positive impact of this project stems from the fact that it makes these artefacts and objects accessible without having to see them in person. It allows one to understand what something might look like in person, in a way that a simple 2D image cannot.
What were some key lessons learned by the team, or that you learnt personally, from collaborating on this project?
Pranav — This being a new project and my first time as project manager, it was a massive learning curve. In our first meeting as a team, we had a brief discussion about what we knew about object and artefact digitisation, and not one person had any background in it. As team lead, this was a scary experience, given that I was expected to plan a project around something I had never seen before. Thankfully, during our meetings the team was able to help me plan our next steps at each stage by voicing their opinions. The key lesson I learned from that scenario was to trust the people in my team and their judgement.
Ryan — A key lesson we learnt in this project is that there is no single correct technique or model to use for our task. Through our experimentation and research, we discovered that every model has its advantages and disadvantages, and it depends on the object we are attempting to digitise. We also learnt the importance of understanding the theory behind the methods we are using, as it makes it significantly easier to understand why a method may not be particularly effective for a specific use case, or why it might perform better than other methods.
Could you please give me three words to describe this project?
Pranav — Fresh, rewarding and ambitious.
Ryan — Innovative, challenging and multifaceted.
Interested in learning more?
Check out these external links:
3D Digitisation example of George Washington — credit to the Smithsonian National Museum of American History
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ML Research Paper Explained)
3D Gaussian Splatting — Explained!
AI Project Library Artefact Digitisation current team:
Pranav Kutty, Ryan Garg, Lily Ung, Laura White
Special thanks to our editors:
Reya Jain and Regina Lu