Preventing artificial intelligence from taking on negative human traits.

FrumiousBandersnatch · May 15, 2021

sjastro said:
I was referring only to reinforcement learning.
The examples you provided are of unsupervised learning where data is used to train the algorithm.
I've mentioned the case of Amazon's recruitment algorithm, which discriminated against hiring women because the training data was biased; the same problems can arise in crime prediction, where the higher arrest and incarceration rates for Indigenous Australians or African Americans can lead to algorithms using racial profiling.
Predictive policing algorithms are racist. They need to be dismantled. – MIT Technology Review

OK; so does this mean reinforcement learning is inapplicable to such contexts?

SelfSim · May 15, 2021

sjastro said:
AlphaZero {AZ} and Leela Chess Zero {LCZ} were trained with an inductive bias but their games are not human like as I can attest to after being slaughtered by Leela Chess Zero.
Maia {M} on the other hand was supervised trained using human games and not only plays like a human but makes all those blunders we humans are capable of making.

Ok .. so may I attempt to recap my understanding, (from what you've highlighted for us), as follows:

i) AZ and LCZ were trained with an inductive bias, but their games are judged (by humans) to be 'not human-like';
ii) M was supervised trained and its games are judged (by humans) to be 'human like';
iii) Reinforcement learning, (AZ and LCZ), is argued as not involving human bias despite the incorporation of a reward system sub process;
iv) Conclusion is: Reinforcement learning is judged (by humans) as producing 'not human-like' performances, despite incorporation of any identified inductive biases;

Is this correct, so far? (Just trying to follow/understand what's being said here).

sjastro · May 15, 2021

FrumiousBandersnatch said:
OK; so does this mean reinforcement learning is inapplicable to such contexts?

My understanding is that this is the case.
Since reinforcement learning is data-less, in the case of crime prevention the algorithm would need to know the difference between right and wrong, or handle the more subtle ethical dilemmas such as when killing is justified as self-defense.
I don’t think reinforcement learning has reached this degree of sophistication.
The data-driven supervised and unsupervised learning processes bypass such issues but, as seen in this thread, lead to other problems.

sjastro · May 15, 2021

SelfSim said:
Ok .. so may I attempt to recap my understanding, (from what you've highlighted for us), as follows:

i) AZ and LCZ were trained with an inductive bias, but their games are judged (by humans) to be 'not human-like';

Yes. When human players sacrifice pawns and pieces in a game, these are examples of tactical chess, calculated over a relatively small number of moves ahead, either to gain an immediate advantage or to checkmate the enemy King.

In the games between AZ and Stockfish, AZ sacrificed pawns for long-term strategic advantages.
Not even Super Grandmasters with Elo ratings in the 2700-2850 range play this way, as strategic chess is much more subtle and the human brain cannot perceive strategic advantages that only emerge when calculated many moves ahead.

ii) M was supervised trained and its games are judged (by humans) to be 'human like';

Correct.
Maia was also trained to play weaker chess in the 1100 – 1900 Elo range.
In games between players whose Elo ratings are <1700, the player who makes fewer mistakes usually wins.
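
For readers unfamiliar with the ratings quoted here, a minimal sketch of what a rating gap means, assuming the standard Elo expected-score formula with its usual 400-point scale. The function name expected_score is made up for illustration, and engine ratings may come from different rating pools than human ones, so treat the numbers as indicative only.

```python
def expected_score(rating_a, rating_b):
    """Expected score for player A (win = 1, draw = 0.5, loss = 0) under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


if __name__ == "__main__":
    # The two ends of the range Maia was trained on, as quoted in the post.
    print(round(expected_score(1900, 1100), 3))
    # Top and bottom of the Super Grandmaster range quoted in the post.
    print(round(expected_score(2850, 2700), 3))
```

Equal ratings give an expected score of 0.5, and an 800-point gap gives the stronger player roughly 0.99.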

iii) Reinforcement learning, (AZ and LCZ), is argued as not involving human bias despite the incorporation of a reward system sub process;

Yes.

iv) Conclusion is: Reinforcement learning is judged (by humans) as producing 'not human-like' performances, despite incorporation of any identified inductive biases;

Yes, because inductive bias is designed to speed up the reinforcement learning process without introducing a human bias into the playing style.
The amount of inductive bias can be reduced by increasing the number of self-played games, as was done for AZ and LCZ, where self-played games numbered in the tens of millions.

FrumiousBandersnatch · May 16, 2021

sjastro said:
My understanding is that this is the case.
Since reinforcement learning is data-less, in the case of crime prevention the algorithm would need to know the difference between right and wrong, or handle the more subtle ethical dilemmas such as when killing is justified as self-defense.
I don’t think reinforcement learning has reached this degree of sophistication.
The data-driven supervised and unsupervised learning processes bypass such issues but, as seen in this thread, lead to other problems.

OK, thanks.

durangodawood · May 18, 2021

sjastro said:
...It states categorically there was no human involvement in the self training process.....

The process being the execution of some code that humans devised?

I dunno. I detect fingerprints all over the process.

Sure, the process goes places the humans couldn't foresee, as do many contraptions we invent. But humans made the darn process, so they can't just wash their hands of the whole thing.

sjastro · May 18, 2021

durangodawood said:
The process being the execution of some code that humans devised?

I dunno. I detect fingerprints all over the process.

Sure, the process goes places the humans couldn't foresee, as do many contraptions we invent. But humans made the darn process, so they can't just wash their hands of the whole thing.

Humans may have written the code but they didn't control the output.

The program:

for x = 1 to 100
    y = rnd(1)
    print y
next x

randomly prints out 100 numbers between 0 and 1, which the programmer has no control over.

Similarly, at the start of training AlphaZero and Leela Chess Zero were random move generators, which then taught themselves the game through self-play without any human involvement, unlike supervised learning.
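
To make the "random move generator that teaches itself" idea concrete, here is a minimal sketch. It is not how AlphaZero or Leela Chess Zero work (those use deep neural networks with tree search); it assumes a toy take-away game (10 stones, remove 1 to 3 per turn, taking the last stone wins) and a plain value table, and all names are made up for illustration. What it shares with the real thing is only the point under discussion: the table starts blank, so early play is random, and the sole feedback is who won each self-played game.

```python
import random

PILE = 10      # starting stones (assumed toy game, not chess)
MAX_TAKE = 3   # a move removes 1, 2 or 3 stones; taking the last stone wins


def legal_moves(stones):
    return range(1, min(MAX_TAKE, stones) + 1)


def greedy_move(q, stones):
    """Move with the highest learned value for this pile size."""
    return max(legal_moves(stones), key=lambda a: q[(stones, a)])


def pick_move(q, stones, rng, epsilon):
    """Mostly follow the table, sometimes explore; ties are broken at random."""
    moves = list(legal_moves(stones))
    if rng.random() < epsilon:
        return rng.choice(moves)
    best = max(q[(stones, a)] for a in moves)
    return rng.choice([a for a in moves if q[(stones, a)] == best])


def train(episodes=20000, epsilon=0.1, alpha=0.1, seed=0):
    rng = random.Random(seed)
    # Every (pile size, move) value starts at 0.0: no human games, no hints.
    q = {(s, a): 0.0 for s in range(1, PILE + 1) for a in legal_moves(s)}
    for _ in range(episodes):
        stones, player = PILE, 0
        history = {0: [], 1: []}  # moves made by each side in this game
        while stones > 0:
            move = pick_move(q, stones, rng, epsilon)
            history[player].append((stones, move))
            stones -= move
            winner = player       # whoever moved last took the final stone
            player = 1 - player
        # The only feedback is the result: +1 for the winner, -1 for the loser.
        for side in (0, 1):
            reward = 1.0 if side == winner else -1.0
            for key in history[side]:
                q[key] += alpha * (reward - q[key])
    return q


if __name__ == "__main__":
    q = train()
    print({s: greedy_move(q, s) for s in range(1, PILE + 1)})
```

With the table still at zero every legal move ties, so play is random. After training, greedy_move should prefer taking the whole pile when 3 or fewer stones remain. The known best play in this game is to leave the opponent a multiple of four stones, which gives something to compare the learned table against.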

SelfSim · May 18, 2021

sjastro said:
Humans may have written the code but they didn't control the output.

The program:

for x = 1 to 100
    y = rnd(1)
    print y
next x

randomly prints out 100 numbers between 0 and 1, which the programmer has no control over.

Similarly, at the start of training AlphaZero and Leela Chess Zero were random move generators, which then taught themselves the game through self-play without any human involvement, unlike supervised learning.

The logic represented by the code is very much an attribute/style of the human thinking brain.
(I'm not yet sure about what implications this has, as far as 'bias' in AZ's and LCZ's learning style though ..?)

sjastro · May 18, 2021

SelfSim said:
The logic represented by the code is very much an attribute/style of the human thinking brain.
(I'm not yet sure about what implications this has, as far as 'bias' in AZ's and LCZ's learning style though ..?)

It simply serves to illustrate that the programmer has no control over the output, which is purely random, as in reinforcement learning at the start of training.

durangodawood · May 18, 2021

sjastro said:
Humans may have written the code but they didn't control the output.

The program:

for x = 1 to 100
    y = rnd(1)
    print y
next x

randomly prints out 100 numbers between 0 and 1, which the programmer has no control over.

Similarly, at the start of training AlphaZero and Leela Chess Zero were random move generators, which then taught themselves the game through self-play without any human involvement, unlike supervised learning.

Yes, I understand that. But the code that executed all of it was devised by humans, for a purpose desired by humans. So to say there was "no human involvement in the self training process" just gives a fantastical impression of what's going on.

Desired the process, designed the process, built the process. That there is human involvement. All the humans didn't do was execute the process.

sjastro · May 18, 2021

durangodawood said:
Yes, I understand that. But the code that executed all of it was devised by humans, for a purpose desired by humans. So to say there was "no human involvement in the self training process" just gives a fantastical impression of what's going on.

Desired the process, designed the process, built the process. That there is human involvement. All the humans didn't do was execute the process.

While humans designed and built the process, the objective of reinforcement learning is to have no human involvement in the training itself: humans do not provide the data for learning; the program trains itself instead.

This is distinctly different from supervised and unsupervised learning.
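
For contrast, a minimal sketch of the supervised case, where the program is handed human decisions and learns to copy them. The game (a pile of stones, take 1 to 3 per turn, taking the last stone wins), the records in human_games, and the function name are all hypothetical, made up for illustration; it is not Maia's actual method, just the simplest form of learning from human examples.

```python
from collections import Counter, defaultdict


def fit_imitation_policy(records):
    """For each pile size, copy whichever move the human records show most often."""
    counts = defaultdict(Counter)
    for stones, move in records:
        counts[stones][move] += 1
    return {stones: c.most_common(1)[0][0] for stones, c in counts.items()}


# Hypothetical human records as (stones left, stones taken).
# With 3 stones left the winning move is to take all 3, but these
# imaginary players usually take 1, and the model inherits that habit.
human_games = [(3, 1), (3, 1), (3, 1), (3, 3), (2, 2), (2, 2), (2, 1)]

policy = fit_imitation_policy(human_games)
```

Here policy[3] comes out as 1, the human blunder, simply because that is what the data shows most often; whatever habits or biases are in the records end up in the model.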

SelfSim · May 18, 2021

sjastro said:
It simply serves to illustrate that the programmer has no control over the output, which is purely random, as in reinforcement learning at the start of training.

Ok .. I agree.
I've gotta admit though, this one's got me scratching my head .. I'm not sure what to make of what's going on there.

I mean, it's a demonstration of what happens when a free-wheeling, computer-based learning process, 'driven' by pure logic (meaning the code), is rigorously applied to a constrained set of rules (meaning Chess), with a singular purpose (meaning to win). The surprise there is that the end result ends up being not human-like.

Maybe it's an example of how we would end up if we ever actually followed our own rules of rational (formal) logic and if we actually abandoned being driven by our own past experiences/'intuitions'?

sjastro · May 18, 2021

SelfSim said:
Ok .. I agree.
I've gotta admit though, this one's got me scratching my head .. I'm not sure what to make of what's going on there.

I mean, it's a demonstration of what happens when a free-wheeling, computer-based learning process, 'driven' by pure logic (meaning the code), is rigorously applied to a constrained set of rules (meaning Chess), with a singular purpose (meaning to win). The surprise there is that the end result ends up being not human-like.

Maybe it's an example of how we would end up if we ever actually followed our own rules of rational (formal) logic and if we actually abandoned being driven by our own past experiences/'intuitions'?

The proof of the pudding is in the eating.
In a previous post I mentioned that human chess players will sacrifice pawns to gain a tactical or winning advantage.
Occasionally top human players will also make positional or strategic sacrifices, which are called speculative sacrifices because the advantage is not clear-cut and rests on factors such as gut feeling or an attempt to unsettle the opponent.

Analysis of the games against Stockfish, a vastly more formidable opponent than the very best human players, indicated that AlphaZero made positional sacrifices which were 100% sound and not speculative at all.
It played at a level beyond not only human players but Stockfish as well, indicating there was no human influence in its play.
Stockfish, being programmed with human chess knowledge, lost decisively.

So yes, while humans developed AlphaZero, the output of AlphaZero, which is how the program played chess, was not human-like.

sjastro · May 19, 2021

@SelfSim

I know you are a fan of the mathematician Marcus du Sautoy.
Here is his take on AI.

I recall a documentary a few years ago where du Sautoy, an excellent chess player, was pitted against a grandmaster (I don't recall his name) in an experiment to show that grandmasters have a biological disposition to playing superior chess.
The experiment was to measure the gamma wave activity of each player.
It was found that the gamma wave activity for the grandmaster was more in the frontal and parietal cortices, whereas for du Sautoy it was in the medial temporal lobe.

It probably explains why the grandmaster destroyed du Sautoy, since the frontal and parietal cortices of the brain are better at processing problems than the medial temporal lobe.

SelfSim · May 20, 2021

sjastro said:
@SelfSim

I know you are a fan of the mathematician Marcus du Sautoy.
Here is his take on AI.

Thanks for that .. .. Interesting .. (it'll take me a while to consume the full 30mins).

It's interesting that it's du Sautoy talking about AI doing Go and Chess .. and you're a Chess player, too.
Chess (and AI) obviously appeals to mathematicians!

sjastro said:
I recall a documentary a few years ago where du Sautoy, an excellent chess player, was pitted against a grandmaster (I don't recall his name) in an experiment to show that grandmasters have a biological disposition to playing superior chess.
The experiment was to measure the gamma wave activity of each player.
It was found that the gamma wave activity for the grandmaster was more in the frontal and parietal cortices, whereas for du Sautoy it was in the medial temporal lobe.

It probably explains why the grandmaster destroyed du Sautoy, since the frontal and parietal cortices of the brain are better at processing problems than the medial temporal lobe.

Well there ya go .. I'll have to conclude that mathematicians and Chess players have obviously let the more important brain centres shrivel .. and let the less important ones, become hideously bloated!(?)
