If you look at the picture, you'll see that the wave is almost symmetrical about the line through its middle. Every time the wave above the line goes up, the wave below it goes down by the same amount. The two halves therefore cancel each other out and, if this signal were fed straight into an audio amplifier, nothing would be heard.
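The cancellation can be checked numerically. The sketch below is only an illustration, not a model of any particular receiver: the 1 kHz test tone, the 50% modulation depth, and the ten samples per carrier cycle are all assumed values, and the function names are made up for this example. It builds one audio cycle of a modulated 10.7 MHz wave and averages it over each carrier cycle, which is roughly what a slow audio stage would respond to.

```python
import math

CARRIER_HZ = 10.7e6        # carrier frequency quoted in the text
AUDIO_HZ = 1_000.0         # assumed test tone
DEPTH = 0.5                # assumed modulation depth
SAMPLES_PER_CYCLE = 10     # assumed number of samples per carrier cycle
SAMPLE_RATE = CARRIER_HZ * SAMPLES_PER_CYCLE


def modulated_wave(num_samples):
    """Carrier whose height follows the audio tone (the symmetrical picture)."""
    wave = []
    for n in range(num_samples):
        t = n / SAMPLE_RATE
        audio = math.sin(2 * math.pi * AUDIO_HZ * t)
        carrier = math.sin(2 * math.pi * n / SAMPLES_PER_CYCLE)
        wave.append((1.0 + DEPTH * audio) * carrier)
    return wave


def average_per_carrier_cycle(wave):
    """Average each block of samples that spans one carrier cycle."""
    step = SAMPLES_PER_CYCLE
    return [sum(wave[i:i + step]) / step
            for i in range(0, len(wave) - step + 1, step)]


# One full cycle of the audio tone.
wave = modulated_wave(int(SAMPLE_RATE / AUDIO_HZ))
averages = average_per_carrier_cycle(wave)
largest_average = max(abs(a) for a in averages)

print(f"peak of the wave: {max(wave):.3f}")
print(f"largest per-cycle average: {largest_average:.6f}")
```

Under these assumptions the wave itself swings well above 1, yet every per-cycle average stays very close to zero, which is the "nothing would be heard" result described above.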
All of the audio information is contained in either half of the wave on its own, upper or lower. As the 10.7 MHz 'carrier wave' has now done its job, we should strip it away and remove the lower half of the wave. The devices shown in the bottom picture (called 'diodes') do just that.
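As a rough illustration of why removing the lower half helps, the sketch below passes a modulated wave through an idealised diode and then averages over each carrier cycle. Everything here is an assumption made for the example: the diode is treated as perfect (a real diode needs a small forward voltage before it conducts), the averaging stands in for the smoothing that a real circuit would typically do with a capacitor, and the 1 kHz tone, 50% depth, and function names are invented for the demonstration.

```python
import math

AUDIO_HZ = 1_000.0         # assumed test tone
DEPTH = 0.5                # assumed modulation depth
SAMPLES_PER_CYCLE = 10     # assumed number of samples per carrier cycle
SAMPLE_RATE = 10.7e6 * SAMPLES_PER_CYCLE  # 10.7 MHz carrier from the text


def envelope(n):
    """Height of the carrier at sample n, set by the audio tone."""
    t = n / SAMPLE_RATE
    return 1.0 + DEPTH * math.sin(2 * math.pi * AUDIO_HZ * t)


def modulated_sample(n):
    """One sample of the modulated carrier."""
    return envelope(n) * math.sin(2 * math.pi * n / SAMPLES_PER_CYCLE)


def ideal_diode(x):
    """Pass the upper half of the wave and block the lower half."""
    return x if x > 0.0 else 0.0


def detect(num_samples):
    """Rectify, then average over each carrier cycle to smooth out the carrier."""
    recovered = []
    last_start = num_samples - SAMPLES_PER_CYCLE
    for start in range(0, last_start + 1, SAMPLES_PER_CYCLE):
        block = [ideal_diode(modulated_sample(n))
                 for n in range(start, start + SAMPLES_PER_CYCLE)]
        recovered.append(sum(block) / SAMPLES_PER_CYCLE)
    return recovered


# One full cycle of the audio tone.
recovered = detect(int(SAMPLE_RATE / AUDIO_HZ))
loudest, quietest = max(recovered), min(recovered)

print(f"recovered signal swings between {quietest:.3f} and {loudest:.3f}")
```

With the lower half gone, the averages no longer cancel. Instead they rise and fall in step with the audio tone, so the recovered values trace out a scaled copy of the original sound.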