COGNITIVE WORLD


Multi-Party Computation: Advisor as a Service


With the release of ChatGPT, I thought I might finally have an answer to one of the few remaining unanswered questions in economics: is there such a thing as a free lunch? After fasting through midday, I was left wondering what else ChatGPT might be good for, and more specifically, what meaningful data it could provide as input to my AI algorithms.

Social media can be thought of as “Confirmation Bias as a Service.” It is architected to understand what confirms my beliefs and it provides that to me. It generally makes me feel good as a person, but my AI algorithms are less moved.

Search is closer to “Agenda Bias as a Service.” Search engines organize responses based on a ranking of profitable suggestions. 

ChatGPT is more like “Group Think as a Service.”  If I say: “Twinkle Twinkle,” it responds with: “Little Star,” as that is the “mode” of the large language library. 

I wonder how ChatGPT would have responded to large language repositories over time. With a few liberties on my part, let’s assume ChatGPT had followed the history of distributed information, was fluent in many languages, and sought out events in large language history. We will start after the Library of Alexandria burned in 48 BC.

ChatGPT version 0.1 would have been delighted with the invention of paper by China’s Cai Lun in roughly 100 AD. Coincidentally, the wheelbarrow was also invented in China shortly after. ChatGPT would have been trained on the Chinese teachings of Confucius, but without paper available to the rest of the world, it would have missed the Kama Sutra from India, the scientific writings of Ptolemy, and the final book of the Christian Bible: Revelation.

If ChatGPT v0.2 were available in the 800s AD, it would have been trained in the largest libraries on earth, in the Middle East, where it would have had an opportunity to leverage the “al” terms of algorithm and algebra.

If ChatGPT v0.3 were available in the 1200s AD, when the great European universities were taking shape and Thomas Aquinas was teaching, it would have added language models based on his Summa Theologica.

ChatGPT v0.4 would have been thrilled to have the printing press in the 1400s AD in Europe, which increased literacy and likely would have changed its conclusions to questions regarding the shape of the earth.

ChatGPT v0.5–0.9 would have included some of the great libraries in history, and thus their perspectives of the world: the British Library, the Library of Congress in the U.S., the Shanghai and Beijing Libraries, the Russian State Library, and the National Diet Library in Tokyo.

One advantage of ChatGPT is that the learning from past great moments in large language history would remain in its long-term memory, even as it updated itself with scientific and philosophical corrections.

Tim Berners-Lee’s creation of the World Wide Web, while a blow to ChatGPT’s frequent flier account, made large language modeling much more feasible. We now have an improved tool for collecting data corresponding to group think. We will call today’s version 1.0. We all hope tomorrow’s ChatGPT version 2.X learns Russian, Chinese, Japanese, and many other languages.

Both social media and ChatGPT can have a role in providing training data for AI algorithms. However, what I also need is data from a trusted advisor – a source that has aligned interests, provides some group insight, and can get me data I can’t get anywhere else. Multi-party computation (MPC) offers this.

The Hello World example for MPC is as follows. A group is sitting around a table and wants to determine the average salary of the group without anyone revealing their actual salary.

It works something like this: there are four people at a table. Each person generates three random numbers and then calculates a fourth number so that the average of the four numbers is their own salary. Each person secretly distributes the three random numbers, one to each of the other three people. Each person now holds four numbers: the one they held back plus one received from each of the others. They each average their four numbers and broadcast the result. The average of the four broadcast values is the table’s average salary.

As an example, parties A through D have private salaries of 100, 80, 70, and 110, which average 90. Party A generates the random numbers -20, 220, and -60, and calculates 260 so that -20 + 220 - 60 + 260 = 400, an average of 100, Party A’s salary. The other parties do the same with their own salaries.

After distribution, each party holds the number it kept plus one random number from each of the other three parties. Averaging and broadcasting those gives 105, 5, 122.5, and 127.5, which in turn average 90, the overall average. Hello World, the average salary is 90.
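That walk-through can be simulated in a few lines of Python. This is a single-process sketch for illustration only (the spread of the random shares is an arbitrary choice); a real deployment would exchange the shares over secure peer-to-peer channels:

```python
import random

def make_shares(salary, n, spread=500):
    """Split a salary into n numbers whose average equals the salary:
    n - 1 random shares plus one calculated share that fixes the sum."""
    shares = [random.uniform(-spread, spread) for _ in range(n - 1)]
    shares.append(salary * n - sum(shares))
    return shares

def mpc_average(salaries):
    n = len(salaries)
    all_shares = [make_shares(s, n) for s in salaries]
    # After the secret exchange, party i holds one share from every
    # party: the one it kept plus one received from each of the others.
    held = [[all_shares[p][i] for p in range(n)] for i in range(n)]
    # Each party broadcasts only the average of the shares it holds.
    broadcasts = [sum(h) / n for h in held]
    # The average of the broadcasts is the table's average salary.
    return sum(broadcasts) / n

print(mpc_average([100, 80, 70, 110]))  # ≈ 90, up to floating-point error
```

No party ever sees anything but random-looking shares, yet the group recovers the exact average.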

If one can get an average value from a group believed to be inclined to cooperate and be honest, there are many applications. For example, consider a group of companies that each price a bond (CUSIP) privately but are curious whether their prices are generally in the ballpark of their competitors’. They would likely also want to know whether they are consistently too high or too low. There may be no accessible market, or the companies may not want to, or may not be able to, share their actual prices. By sampling, say, 30 CUSIPs and performing MPC to calculate the average price of each, a company can determine whether it is above or below the average.

Of course, if one can calculate an average, one can also calculate a standard deviation (and the further moments of skewness and kurtosis) if desired. With a few more iterations, one can use MPC to calculate the minimum and maximum as well.
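For instance, the standard deviation follows from two averages: one over the values and one over their squared values. Each of those averages could itself be one share-and-average MPC round; the prices below are hypothetical, and only the arithmetic is shown:

```python
prices = [100.0, 80.0, 70.0, 110.0]
n = len(prices)

# Each of these two averages could be produced by one MPC round,
# without any party revealing its individual price.
mean = sum(prices) / n
mean_of_squares = sum(p * p for p in prices) / n

variance = mean_of_squares - mean ** 2  # population variance
std_dev = variance ** 0.5
print(variance, round(std_dev, 2))  # 250.0 15.81
```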

The value of MPC is that it allows one to ask aggregate questions of those one trusts while protecting the individual sources of data. It creates an anonymous market and provides a different source of data for AI to chew on.

Consider 10 agencies, one of which wonders whether it has a compromised compiler (the Ken Thompson attack). An anonymous market could be spun up to answer the question: “Given environment E, compiler version C, and package P, what is the average of the hash of the compiled program?” The agencies could enter into an MPC effort, and if the average equals what the individual agency calculated on its own, the conclusion is that they all have the same compiled code. If it does not, then at least one is different, though no one would know who is compromised. With that information, one could repeat the effort with different subgroups of the original group or pursue confidence via other channels.
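A sketch of that check in Python, hashing with SHA-256 and using exact integer sums rather than averages to avoid rounding; the binaries are hypothetical stand-ins, and the share exchange is simulated in one process:

```python
import hashlib
import random

def hash_as_int(blob: bytes) -> int:
    """SHA-256 of a compiled binary, interpreted as a big integer."""
    return int.from_bytes(hashlib.sha256(blob).digest(), "big")

# Hypothetical binaries; one agency's compiler produced different output.
binaries = [b"compiled-output"] * 9 + [b"compiled-output-tampered"]
hashes = [hash_as_int(b) for b in binaries]
n = len(hashes)

# Each agency splits its hash into n integer shares that sum to the hash,
# keeps one share, and secretly distributes the rest (simulated here).
shares = []
for h in hashes:
    r = [random.randrange(2**256) for _ in range(n - 1)]
    shares.append(r + [h - sum(r)])

# Agency i broadcasts the sum of the n shares it holds; the grand total
# is the sum of all hashes, revealing nothing about any individual hash.
broadcasts = [sum(shares[p][i] for p in range(n)) for i in range(n)]
total = sum(broadcasts)

# If everyone compiled identical code, the total is n times any one hash.
all_match = total == n * hashes[0]
print(all_match)  # False: at least one binary differs, but not which one
```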

For the past 20+ years, universities and consultants have pushed for analytics, describing data as the new oil and recommending best practices for managing it as a proprietary asset. More recently, they have pushed for blockchain, citing the value of shared immutable data. The two messages seem to be in conflict, which has made blockchain a hard sell to many organizations (I won’t name names).

MPC is a compromise. For example, instead of sharing the data specifics of an advance shipping notice, parties can ask questions such as “Do we all agree on when the next delivery will be?” Parties in such a supply chain can perform an MPC round to calculate the average date held in their systems, and if the result matches the date Party A expects, the date is confirmed. There is no need to share the entire EDI 856 data package.

AI has less work to do if it is given good data. Social data and search results are easy to access but can get a person in trouble if, for example, a result suggests that women in their 50s are past their prime. Large language models used by tools like ChatGPT produce group think, a regression-to-the-mean pattern. That too can have value, but innovation might come more readily if one could ask ChatGPT to move away from the mean. It would be even more interesting if it could give a response such as “How might someone from Russia, Saudi Arabia, or Nigeria answer this question?” – a group-think perspective from a different language corpus. We might find quite a bit in common.

Data sourced from MPC can serve as a trusted advisor, providing data that does not exist anywhere else. It comes as responses to aggregate questions, and I can think of 100 such questions that would provide shared value while keeping source data private. Perhaps we could even run our own MPC to ask readers whether they would like such a list. Questions with binary responses would return an average, such as 0.82, indicating that 82% of our readers would like the list, all without anyone knowing who is on which side of the “election.”

Perhaps AI thinks it is getting a free lunch? The cost of MPC is in the network timing. For a sum with 1,000 participants, each party would have to generate 999 peer-to-peer messages, resulting in roughly one million messages overall. Depending on the value of the response, and the bias in the selection of the parties to begin with, it may or may not be worth the effort.

Anonymous markets can provide insights where actual markets can’t exist. The CDC might be curious whether a disease is on the rise, but individual doctors might be uncomfortable sharing patient data; MPC would allow anonymous signals to be shared while maintaining patient privacy. A company may have found a vulnerability in a software package and want to alert others without anyone knowing it uses that package. It could ask peers for a risk rating along the lines of a severity score for a MITRE CVE, and the MPC could return the maximum risk rating among the peers. The group would be alerted that a package is risky without anyone knowing who identified the vulnerability.
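One way to get that maximum is with repeated counting rounds: scanning scores from the top, each round securely counts how many peers rate the package at or above a threshold. The counts are revealed, but never who contributed them. A sketch assuming a hypothetical 0-10 rating scale, with the share exchange simulated in one process:

```python
import random

def mpc_count(indicators):
    """One MPC round: securely sum 0/1 indicators without revealing who
    contributed which bit (integer shares, simulated in one process)."""
    n = len(indicators)
    shares = []
    for bit in indicators:
        r = [random.randrange(-10**6, 10**6) for _ in range(n - 1)]
        shares.append(r + [bit - sum(r)])
    broadcasts = [sum(shares[p][i] for p in range(n)) for i in range(n)]
    return sum(broadcasts)

# Hypothetical private risk ratings held by five peers.
ratings = [2, 0, 9, 4, 1]

# Scan from the top: the first threshold any peer meets is the group maximum.
for threshold in range(10, -1, -1):
    if mpc_count([1 if s >= threshold else 0 for s in ratings]) > 0:
        group_max = threshold
        break
print(group_max)  # 9
```

Each round reveals only an aggregate count, so the peer who reported the high rating stays anonymous.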

I’ll wait for the readership to respond with an MPC-based anonymous survey before I offer more questions, but I invite you to come up with your own list. It is another form of data for ingestion by AI, and the data might have particular value because you control the source selection bias.


Daniel Conway, Ph.D., is an analytics and AI journeyman. He holds a Ph.D. in Operations Research from Indiana University and has served on the faculty of Notre Dame, Indiana, Northwestern, Iowa, Florida, Virginia Tech, South Florida, and Arkansas. Dan serves on several blockchain standards boards and advises several blockchain startups. In addition to his university faculty roles, he has consulted for dozens of companies, including Cisco Systems, Deere, AIG, Consulate Healthcare, and other organizations with an interest in information security.