AI Arguing Machines and AI Self-Driving Cars


By Lance Eliot, the AI Trends Insider

I’d like to argue with you. Ready?

No, I don’t think you are ready. What’s that, you say that you are ready. I don’t think so. You insist that you are ready to argue? Sorry, it doesn’t seem like you are ready.

If you’ve ever seen the now-classic Monty Python skit about arguments, my aforementioned attempt to argue with you might seem familiar. In the skit, a man goes to an argument “clinic” that allows you to pay money to argue with a professional arguer. At one point, the man seeking an argument gets upset that the arguer is merely acting in a contradictory way and not truly providing an argument. They then argue about whether contradiction itself is a valid form of arguing, which the man paying for the argument insists is a hollow form of arguing and not intellectually in the spirit of a true argument.

It is a rather clever and memorable skit.

I bring up the notion of arguing and arguments due to the emerging approach of using “arguing machines” in the realm of AI. I’d like to share with you various facets of the arguing machines approach and also establish how it relates to AI self-driving cars.

Take a look at Figure 1.

I’ve depicted a maturity matrix framework for AI Arguing Machines, which I’ll be covering herein. The matrix encompasses some of the key criteria or characteristics that distinguish between various AI Arguing Machines approaches. The maturity levels range from lowest (Level C) to highest (Level A). The lowest level is considered an Initial approach (Level C), the next level is labeled as Capable (Level B), and the highest or most extensive approach is referred to as the Master level (Level A).

In a prior column, I detailed various aspects of fail-safe AI and mentioned that to try and have AI that can be somewhat fail-safe (i.e., safe to fail), you often will have a primary AI system and a secondary AI system. The primary AI system is, let's say, assigned to be the key runner of whatever is taking place, while the secondary AI system is typically there as a back-up in case the primary AI somehow fails or suspects that it is failing.

For my article about fail-safe AI, see:

For explanation-AI Machine Learning, see my article:

For the need to have resiliency in AI systems, see my article:

You might set up the primary AI to do a kind of self-diagnosis, and if it begins to believe that it is having internal troubles, it might hand over the AI tasks at-hand to the secondary AI. There could be an ongoing handshake taking place between the primary AI and the secondary AI, and at some point the primary AI can turn over the effort to the secondary AI. Or, if the primary AI does not perform the handshake on a timely basis, the secondary AI could assume that the primary AI is in trouble and thus forcibly take over the task.
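To make the handshake mechanics concrete, here's a minimal Python sketch of a heartbeat-style monitor; the class name, timeout value, and the one-way switching policy are all illustrative assumptions, not a prescribed design.

```python
import time

class FailoverMonitor:
    """Tracks heartbeat handshakes from the primary AI; if the primary
    misses its deadline, control is forcibly handed to the secondary."""

    def __init__(self, heartbeat_timeout_s):
        self.heartbeat_timeout_s = heartbeat_timeout_s
        self.active = "primary"
        self.last_heartbeat = time.monotonic()

    def record_heartbeat(self):
        # Called each time the primary AI completes its handshake.
        self.last_heartbeat = time.monotonic()

    def check(self, now=None):
        # If the primary has gone silent past the timeout, switch to the
        # secondary and stay there (resuming a failed instance is dicey).
        now = time.monotonic() if now is None else now
        if self.active == "primary" and now - self.last_heartbeat > self.heartbeat_timeout_s:
            self.active = "secondary"
        return self.active
```

In this sketch the switch is permanent, reflecting the point made later that resuming a previously failed instance is risky.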

There are a variety of means to set up the relationship and to govern which instance should be running at any given moment, whether the primary AI or the secondary AI.

In my fail-safe AI article, I had also pointed out that we could use a “Chooser” that would try to decide which of the AI instances should be used.

We might have more than just a primary AI and a secondary AI, in that we might have some N number of AI instances, all of which are able to perform the same tasks, and all of which can potentially be chosen to run the task at-hand. We'll refer to the primary AI as instance 1, the secondary as instance 2, and so on, all of which are actively executing and keeping up with the tasks underway. Thus, the Chooser can select from any of them, under the belief that they are all ready and active to step into the role of being the actual "primary" AI undertaking the task at-hand.

Typically, there is a "switching cost" involved in turning over execution to another of the active instances, so the Chooser needs to be cautious about arbitrarily turning over the task to any of the other available and viable AI instances. Switching to another AI instance might introduce latency, and if the AI is performing a real-time task, timing is crucial to whatever chore is being performed.

For my article about cognitive AI timing aspects, see:

For more about Machine Learning (ML) and AI self-driving cars, see my article:

For human-aided training of deep reinforcement learning for AI self-driving cars, see my article:

For AI algorithmic transparency aspects, see my article:

By-and-large, the Chooser will be tending toward allowing an AI instance to become the primary and stick with it, until or if there is a circumstance that suggests the AI instance is no longer able to properly undertake the chores.

Thrashing about by continually switching from one AI instance to another would likely be problematic. Plus, presumably once the Chooser has switched away from an AI instance, it is likely due to a failure of that AI instance, and resuming it might be dicey since it might no longer be adequately able to perform the tasks at-hand.

The Chooser aka Arbitrator or Judge

How might the Chooser decide which of the available and active AI instances is a suitable choice?

A simplistic method involves having an AI instance provide a status indicator and as long as the status indicator seems to be positive or good, the Chooser can select that AI instance. Thus, we might begin the task with AI instance 1 (known as the primary AI at that juncture), and the Chooser is keeping tabs on the status indicators of the AI instances of 1 to N. If the current primary AI indicates a “bad” status indication, the Chooser could then select any of the remaining AI instances that are still reporting a positive or “good” indication.

The selection might not be done in a random manner; instead, the AI instances might be lined up in a preferred selection sequence. In that case, when AI instance 1 no longer seems good, the Chooser would switch the tasks over to AI instance 2. AI instance 2 then becomes the "primary" AI for the moment. If it goes to a "bad" indicator status, the Chooser would switch things over to AI instance 3. And so on.
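The status-indicator scheme boils down to a simple ordered scan. This hypothetical helper assumes each instance reports a boolean good/bad flag in its preferred selection order:

```python
def select_instance(statuses):
    """Given an ordered list of status flags for AI instances 1..N
    (True = 'good'), return the zero-based index of the first instance
    still reporting good, or None if all have gone bad."""
    for i, good in enumerate(statuses):
        if good:
            return i
    return None
```

For example, if instance 1 has gone bad while instances 2 and 3 remain good, the scan lands on instance 2, matching the preferred-sequence behavior described above.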

Rather than having each of the AI instances report a status indicator, another approach would be to have each instance present what it is intending to do next.

For example, suppose the task involves steering an AI self-driving car. The current primary AI might indicate that the next act of steering is to involve having the steering wheel angle turn to degree X. Meanwhile, the secondary AI might indicate that the steering wheel angle should be the degree Y. We’ll use just the two AI instances in this example for now, though as I’ve mentioned there could be N number of them.

The Chooser could inspect the primary AI’s indication of X and compare it to the secondary AI’s indication of Y. If the X and Y are approximately equal, the Chooser might consider this as a sign of agreement between the primary AI and the secondary AI. Since there is agreement, the Chooser has no need to switch from the primary AI and allows the primary AI to proceed.

Notice that I mentioned that the X and Y are considered “in agreement” if they are approximately equal. I say this because the primary AI and the secondary AI might have different methods of ascertaining the steering wheel angles and thus, they might differ somewhat about the particular amount. If the difference is considered miniscule and not substantive, it is not worth the effort to declare a disagreement. The amount of variance allowed between the two would be dependent upon the nature of the element being used and the Chooser would need to have some pre-defined basis for declaring that a difference was substantive.
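A minimal way to encode this "approximately equal" test is an absolute tolerance check. The half-degree threshold below is purely illustrative; in practice, the allowed variance would be predefined per element, as noted above.

```python
import math

def in_agreement(x_degrees, y_degrees, tolerance_degrees=0.5):
    """Treat the two proposed steering angles as agreeing when their
    difference is below a predefined, non-substantive threshold.
    The 0.5-degree default is illustrative only."""
    return math.isclose(x_degrees, y_degrees, abs_tol=tolerance_degrees)
```

If `in_agreement` returns True, the Chooser lets the primary AI proceed; otherwise, the disagreement-handling path kicks in.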

Let’s suppose that the X and Y are indeed substantively different, and the Chooser now needs to decide whether to continue with the primary AI that provided the X or to switch over to using the secondary AI that provided the Y.

We have returned to the question about how the Chooser is to make a choice between the AI instances. When we had the simplistic approach of a status indicator, it was easy for the Chooser because all it had to do was monitor the status indicators and the moment that the active AI went to a “bad” indicative status it meant that the Chooser should switch over the task to another AI instance.

With this matter of a substantive difference between X and Y, should the Chooser assume that the secondary AI is the "right one" and the primary AI is the "wrong one" (meaning that the correct angle to be used is Y and not X)? But it could be that the secondary AI is the "wrong one" and the primary AI is the "right one." Of course, we could decide beforehand that any time a substantive difference arises, whichever AI is primary must be mistaken and therefore switch over to another instance, though this logic would usually seem suspect.

We could require that the Chooser be more sophisticated and be able to weigh-in about which of the X or Y is the better choice. If the Chooser then ascertained that the X is the better choice, it would presumably continue with the running of the primary AI, while if the Y is the better choice then the Chooser would presumably switch over to the secondary AI.

You might think of this Chooser as a kind of judge in a court of law. There are two parties in the courtroom, the primary AI and the secondary AI. They are standing before the court. One professes that the X is the proper choice. The other professes that Y is the proper choice. The judge is able to determine that X and Y are substantively different. Should the judge opt to say that the primary AI is right and should proceed, or should the judge opt to say that the secondary AI is right and it should proceed instead of the primary AI?

If the judge happens to know a lot about the nature of the task and the meaning of the X and Y, it could be that the judge is sophisticated enough to be able to make the choice without any other input, other than the X and Y that is being presented.

The rub is that the more sophisticated the Chooser (the judge) becomes, the greater the chance that it substitutes its own judgment for the presumably more elaborate capabilities of the primary AI and the secondary AI. In that sense, we're almost placing the Chooser into the role of being another AI instance that has to be equivalent to the other AI instances involved in performing the task at-hand, but is that what we intend to have happen?

Instead, we might allow for an argument to occur between the primary AI and the secondary AI, of which by presenting the argument to the Chooser, it can base a selection of either the primary AI or the secondary AI based on the greater of the prevailing arguments. We could then refer to the Chooser as a judge or an arbitrator. Rather than making a choice based on a simplistic status indicator, and rather than making a somewhat random choice if the X and Y produced are substantively different, we might have the Chooser act as a judge or arbitrator that entertains respective arguments from each party and uses those arguments as part of the criteria for making a switching decision.

In short, here are our rules:

  •         If the primary AI and the secondary AI are in agreement about the next action, which could be ascertained by comparing their next action choices of X and Y respectively and allowing that they are in agreement if the difference is considered negligible with respect to whatever has been predefined as a non-substantive difference, the Chooser (aka judge or arbitrator) allows the primary AI to continue ahead.
  •         If the primary AI and the secondary AI are in disagreement about the next action, which was ascertained by the detection of a substantive difference in their next action choices of X and Y respectively, the Chooser (aka judge or arbitrator) requests and inspects arguments from the primary AI and the secondary AI to then make a choice as to whether to continue ahead with the primary AI or to switch to the secondary AI.
  •         These above rules can be recast in the matter of having some N instances of the AI system and thus is not limited to just having two AI systems in-hand.
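The two rules above can be sketched as a single decision function. The dict layout and the numeric "argument strength" returned by each party are hypothetical stand-ins for whatever argument representation the instances would actually present:

```python
def choose(primary, secondary, tolerance=0.5):
    """Sketch of the agreement/disagreement rules. Each party is a dict
    with a proposed 'action' (e.g., a steering angle in degrees) and an
    'argue' callable returning a numeric argument strength. Both the
    structure and the tolerance value are illustrative assumptions."""
    x, y = primary["action"], secondary["action"]
    if abs(x - y) <= tolerance:
        return "primary"  # negligible difference: no switch needed
    # Substantive disagreement: request both arguments, pick the stronger.
    return "primary" if primary["argue"]() >= secondary["argue"]() else "secondary"
```

The same logic generalizes to N instances by comparing the current primary against each challenger, or by scoring all N arguments and selecting the maximum.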

Assuming that the AI system is an integral part of a real-time system, and for which the real-time system involves life-and-death matters such as those involved in the driving task of an AI self-driving car, the Chooser must be streamlined to handle the assessment of the arguments, and too the AI instances must be able to rapidly present their arguments.

Timing Crucial to Making Choices

I mention this timing aspect because we are adding overhead to the whole mechanism of what is taking place for choosing the next action. Any delay in executing the next action could have dire consequences. We would need to figure out the appropriate timing allowed for sufficient time to make the final judgment or choice.

A fail-safe on the fail-safe AI might be to have a choice made on some predefined basis that if time runs out while trying to make a choice, any of the available choices is considered “better” than making no choice at all (it’s like a chess match in which the timer runs out for you to make a choice, and you are torn about which move to make, but you forfeit the match if you don’t make a move, so you then select whatever move seems viable and hope for the best). Or, some other more informed choice mechanism might be employed.

Let’s consider the timing aspects. The primary AI needs to figure out and present its argument. The secondary AI needs to figure out and present its argument. The Chooser needs to wait for and then assess the two respective arguments. Once the Chooser has made its choice, we are then at the point of dealing with the switchover costs, as alluded to earlier, assuming that the Chooser opts to switch AI instances after conducting the assessment and making a judgement.

I realize that on the surface the overhead of preparing and presenting arguments would seem at first glance not viable and actually somewhat crazy to consider. Do keep in mind though that it would be a canned kind of affair. It would be highly questionable if the AI systems were not well prepared to undertake this set of steps of trying to prove or verify their choices. Instead, the AI system and each of the instances is purposely wired-up to perform this task, and thus the AI developers would have tuned it beforehand to try and ensure it runs speedily.

I used the analogy of a judge in a courtroom. I don’t want you to overinflate that analogy. We are used to seeing courtroom dramas that involve quite elaborate arguments and the judge has to keep the parties focused on the case. The AI mechanism for the arguing machines is not going to allow for proclaimed objections by the parties and nor any kind of back-and-forth to try and sway the case. I suppose you might think of this as a cut-and-dried kind of courtroom case, maybe like a streamlined and tightly run Small Claims court. You walk into court, tersely present your case, the judge slams down the gavel and makes a quick choice. On to the next case!

That deals with the timing aspects.

Suppose though that the judge doesn't show up for the case? This would be equivalent to having the Chooser itself become at fault and perhaps be unable to participate in the arguing machines debate.

In fact, it is dicey that we are setting up a system involving a Single Point of Failure (SPOF), namely that if the Chooser or judge or arbitrator of the AI systems is unable to perform (and we only have one of them), we then have an untoward situation. We might get more elaborate and have multiple Choosers. Or, we might decide that the primary AI will continue unabated and only if it decides to switch over would the secondary AI then take the reins, in the scenario of a Chooser that is absent from the game. And so on.

I'll also mention that if the Chooser or judge or arbitrator shows up but goes nutty, we still have a problem on our hands, and indeed the situation becomes somewhat more troubling than if it were not present at all. How will the AI systems be able to realize that the Chooser has gone wild? Again, there would need to be a further fail-safe devised for this scenario.

For my article about the safety aspects of AI self-driving cars, see:

For the reframing of AI levels for self-driving cars, see my article:

For Federated Machine Learning, see my article:

For Ensemble Machine Learning, see my article:

Impacts for AI Self-Driving Cars

What does this have to do with AI self-driving cars?

At the Cybernetic AI Self-Driving Car Institute, we are developing AI software for self-driving cars. One emerging approach that some AI developers are pursuing involves incorporating the use of AI arguing machines into their systems, and this is pertinent to AI self-driving car systems too.

Allow me to elaborate.

I’d like to first clarify and introduce the notion that there are varying levels of AI self-driving cars. The topmost level is considered Level 5. A Level 5 self-driving car is one that is being driven by the AI and there is no human driver involved. For the design of Level 5 self-driving cars, the auto makers are even removing the gas pedal, brake pedal, and steering wheel, since those are contraptions used by human drivers. The Level 5 self-driving car is not being driven by a human and nor is there an expectation that a human driver will be present in the self-driving car. It’s all on the shoulders of the AI to drive the car.

For self-driving cars less than a Level 5, there must be a human driver present in the car. The human driver is currently considered the responsible party for the acts of the car. The AI and the human driver are co-sharing the driving task. In spite of this co-sharing, the human is supposed to remain fully immersed into the driving task and be ready at all times to perform the driving task. I’ve repeatedly warned about the dangers of this co-sharing arrangement and predicted it will produce many untoward results.

For my overall framework about AI self-driving cars, see my article:

For the levels of self-driving cars, see my article:

For why AI Level 5 self-driving cars are like a moonshot, see my article:

For the dangers of co-sharing the driving task, see my article:

Let’s focus herein on the true Level 5 self-driving car. Much of the comments apply to the less than Level 5 self-driving cars too, but the fully autonomous AI self-driving car will receive the most attention in this discussion.

Here’s the usual steps involved in the AI driving task:

  •         Sensor data collection and interpretation
  •         Sensor fusion
  •         Virtual world model updating
  •         AI action planning
  •         Car controls command issuance
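As a rough illustration, the five steps can be sketched as one pass of a processing loop; every stage below is a toy stand-in for a far more elaborate subsystem, and all the names are hypothetical.

```python
class WorldModel:
    """Toy virtual world model holding the latest fused detections."""
    def __init__(self):
        self.objects = []
    def update(self, fused):
        self.objects = fused

def interpret_sensors(raw):
    # Stand-in: tag each raw reading as a detection.
    return [("detection", r) for r in raw]

def fuse(detections):
    # Stand-in: deduplicate overlapping detections.
    return sorted(set(detections))

def plan_actions(model):
    # Stand-in: steer away if anything was detected, else hold course.
    return "steer_avoid" if model.objects else "hold_course"

def issue_commands(plan):
    return {"command": plan}

def driving_cycle(raw_sensor_data, world_model):
    """One pass through the five-stage driving-task pipeline above."""
    detections = interpret_sensors(raw_sensor_data)  # 1. sensor data collection & interpretation
    fused = fuse(detections)                         # 2. sensor fusion
    world_model.update(fused)                        # 3. virtual world model updating
    plan = plan_actions(world_model)                 # 4. AI action planning
    return issue_commands(plan)                      # 5. car controls command issuance
```

In an arguing-machines arrangement, each AI instance would run its own such cycle, with the arbitrator comparing the outputs of the final command-issuance stage.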

Another key aspect of AI self-driving cars is that they will be driving on our roadways in the midst of human driven cars too. There are some pundits of AI self-driving cars that continually refer to a utopian world in which there are only AI self-driving cars on the public roads. Currently there are about 250+ million conventional cars in the United States alone, and those cars are not going to magically disappear or become true Level 5 AI self-driving cars overnight.

Indeed, the use of human driven cars will last for many years, likely many decades, and the advent of AI self-driving cars will occur while there are still human driven cars on the roads. This is a crucial point since this means that the AI of self-driving cars needs to be able to contend with not just other AI self-driving cars, but also contend with human driven cars. It is easy to envision a simplistic and rather unrealistic world in which all AI self-driving cars are politely interacting with each other and being civil about roadway interactions. That’s not what is going to be happening for the foreseeable future. AI self-driving cars and human driven cars will need to be able to cope with each other.

For my article about the grand convergence that has led us to this moment in time, see:

See my article about the ethical dilemmas facing AI self-driving cars:

For potential regulations about AI self-driving cars, see my article:

For my predictions about AI self-driving cars for the 2020s, 2030s, and 2040s, see my article:

Returning to the topic of AI arguing machines, I’d like to discuss a recent study done by MIT on the matter and that showcases various facets of this emerging approach.

The MIT study entitled "Arguing Machines: Human Supervision of Black Box AI Systems That Make Life-Critical Decisions" (co-authored by Lex Fridman, Li Ding, Benedikt Jenik, and Bryan Reimer) provides a quite interesting examination of the use of arguing machines. Two use cases are considered: one involving an image classification problem, and the other involving a self-driving car matter.

MIT Study on Arguing Machines

Similar to my earlier herein remarks about arguing machines, the MIT study posits that a crucial challenge for making life-critical system-based decisions encompasses those moments when there is a small margin of allowable error in real-world applications entailing human lives. They use a system arbitrator to ascertain whether there is agreement or disagreement between a primary AI and a secondary AI.

They opted to use an image classification use case because image classification is one of the most studied problems in AI, and if their arguing machines approach could aid or enhance image classification, it would well suggest that the technique can be advantageous to a wide array of AI areas. In essence, if you can improve something that has already had everything but the kitchen sink thrown at it, the odds are that you've got a technique or approach that could help in many other circumstances, including ones that are not nearly as well solved or studied.

For the image classification experiment, they used the popular ResNet image classifier as the primary AI and the also equally popular VGG as the secondary AI. Images were chosen from the ImageNet Dataset.

Note that the two AI systems are entirely different from each other and are not somehow clones or near duplicates. I mention this aspect because some people at times assume that the primary AI and the secondary AI would be near copies of one another, but this is not necessarily the case. They can each be devised to solve the same problem, likely meeting the same or an equivalent set of requirements, and yet make use of completely different algorithms and methods to arrive at their results.

The AI instances involved in a fail-safe AI operation can each be designed and developed by separate teams. The instances could have been developed at different times. They might differ in the programming languages used and the AI techniques included. What is key would be that the AI instances are performing the same set of tasks and therefore each in their own “black box” way is trying to accomplish the same tasks at-hand. If that were not the case and they were doing varying tasks, it would no longer be the case that they would be considered straight-ahead failovers for each other.

One particularly fascinating aspect of the experiment was that both ResNet and VGG make use of deep convolutional neural networks, and both had been trained on the same dataset, and yet they did not "fail" in the same manner during the experimental testing. In other words, you might at first assume that they would agree almost entirely, and even when wrong, they would both be wrong at the same time. This was not the case. As the researchers pointed out, the result suggests and provides further impetus to have other means to gauge the outcomes of AI systems, such as the use of arguing machines.

Furthermore, the arguing machines approach they utilized was able to do an admirable job in terms of improving the image classification performance. This highlights that the act of having two AI systems and monitoring their disagreements can be a significant booster towards reducing errors and aiding performance.

In the second use case, the researchers again used an argument arbitrator, which they also referred to as a disagreement function, and focused on the automated lane-keeping feature of the Tesla Autopilot. Included in their experiment was a dataset of 6,500 disengagements, which are circumstances wherein a human back-up operator opted to disengage the Autopilot. They developed a 5-layered convolutional neural network to serve as the secondary AI. The primary AI was the Tesla Autopilot software. The researchers also devised two key metrics, a False Accept Rate (FAR) and a False Reject Rate (FRR), enabling them to gauge the performance of the disagreement system.
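Under one common formulation (not necessarily the study's exact definitions), FAR and FRR for a disagreement-based warning system could be computed like this, treating each case as a pair of flags:

```python
def false_accept_rate(decisions):
    """FAR: among cases where no disagreement was flagged (the primary's
    output was accepted), the fraction where a human actually disengaged.
    Each decision is (flagged_disagreement, human_disengaged) booleans.
    This is a generic formulation, offered as an illustration only."""
    accepts = [d for d in decisions if not d[0]]
    if not accepts:
        return 0.0
    return sum(1 for d in accepts if d[1]) / len(accepts)

def false_reject_rate(decisions):
    """FRR: among flagged disagreements, the fraction where the human did
    NOT disengage, i.e., a needless alarm."""
    rejects = [d for d in decisions if d[0]]
    if not rejects:
        return 0.0
    return sum(1 for d in rejects if not d[1]) / len(rejects)
```

Lowering one rate typically raises the other, so tuning the disagreement threshold becomes a trade-off between missed failures and nuisance alarms.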

For aspects about convolutional neural networks, see my article:

For my caveats about the use of human back-up driver operators, see my article:

For my forensic analysis of the Uber incident in Arizona, see:

For my article about the dangers of irreproducibility, see:

For the nature of uncertainty and probabilities in AI self-driving cars, see my article:

Impressively, they further explored the approach by constructing a real-time version that ran inside an in-motion Tesla car. They used an NVIDIA Jetson TX2 to run the model, developed and installed a custom interface to connect with the CAN bus of the car, put in place a dashboard-mounted Logitech C920 camera, and used OpenCV for the live video streaming aspects. The devised neural network made use of PyTorch.

I had earlier mentioned the importance of figuring out the timing aspects, and in this case they were able to get the latency down to 200 milliseconds or less, counting from the time of the camera input to the point of a screen GUI update display with the result.

The output factor consisted of the steering angle.

The researchers indicated they used this setup even in evening rush hour traffic, which assuming this took place in Boston, I’d rate their evening rush hour as snarled and messed-up as the traffic here in Los Angeles (therefore presenting a rather invigorating challenge!).

The MIT study is a great example of the potential of AI arguing machines and provides a handy launching pad upon which future work can build.

Expansion of the AI Arguing Machines Framework

One aspect about the study was that it included the use of a human supervisor as integral to the experimental setup. The machine-based arbitrator consults with a human supervisor. This squarely puts the human-in-the-loop. For AI self-driving cars less than Level 5, this showcases that there is an opportunity to enhance existing in-car automation to aid the human driver in the undertaking of the driving task.

Also, the arguing machines were providing solely their output for purposes of ascertaining agreement versus disagreement. This is akin to the X and Y outputs that I’ve mentioned earlier herein.

Some setups for arguing machines do not have the machines actually carrying out an argument per se. In a manner similar to the famous Monty Python skit, the primary AI and the secondary AI are actually just differing in their outputs, perhaps you might say they are contradicting each other. Does the use of contradiction alone suffice to be an argument? The skit asked that same question.

In the case of true Level 5 self-driving cars, we are pursuing the circumstance of no human supervisor or human intervention to participate in the arbitration, along with the outputs of the AI instances being more robust, including too the presentation of their “arguments” for why their output should be considered the better of the choices provided to the arbitrator.

Removing the human supervisor accommodates the intent of true Level 5 self-driving cars. Making use of a more elaborated form of argument, beyond just the act of contradiction, will hopefully allow the arbitrator to make a more informed choice, doing so in lieu of resorting to asking a human to get involved.

When there is a disagreement, the more compelling argument will be given the nod as the "winner" having the better output. There are various ways to quantify a compelling versus less compelling argument, which in our case involves the assignment of probabilities and uncertainties. This has to be done in a very time-sensitive manner, and thus we are using templated structures that cover the main acts that the self-driving car would undertake while performing various driving tasks.
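One hypothetical way to quantify a "more compelling" argument from templated probability and uncertainty values is a simple score; the dict fields and the scoring formula below are illustrative only, not our actual templates.

```python
def score_argument(argument):
    """Hypothetical templated argument: the AI instance attaches a
    probability to its proposed action plus an uncertainty estimate.
    Discounting probability by uncertainty is one illustrative choice."""
    return argument["probability"] * (1.0 - argument["uncertainty"])

def pick_winner(primary_arg, secondary_arg):
    # The more compelling (higher-scoring) argument wins the nod;
    # ties go to the primary to avoid needless switching cost.
    if score_argument(primary_arg) >= score_argument(secondary_arg):
        return "primary"
    return "secondary"
```

Because the templates are fixed ahead of time, scoring reduces to a handful of arithmetic operations, which keeps the arbitration within a real-time budget.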

For the Turing test and AI, see my article:

For my article about AI as a potential Frankenstein, see:

For the potential coming singularity of AI, see my article:

For idealism about AI, see my article:


The use of AI arguing machines is a handy approach for dealing with real-time AI systems and provides a means to seek fail-safe operations involving life-and-death matters for humans. This is a somewhat newer and less-explored area of AI and still needs a lot of additional systematic study before it will be fully ready for prime time. Nonetheless, I don’t think there is much argument that AI arguing machines are well worth the attention of AI researchers and AI developers. Yes, it is. And if you say that no it is not, I’ll repeat again that yes it is. And so on, we will go.

Copyright 2018 Dr. Lance Eliot

This content is originally posted on AI Trends.