sox -t .wav "|arecord -d 2" -n stat
With -t .wav we tell sox that the input is WAV data, "|arecord -d 2" runs the arecord program for two seconds and pipes its recording into sox, -n sends the audio output to the null file (it is discarded), and stat requests statistics about the signal.
The output of this command, on my system with some background speech, is:
Recording WAVE 'stdin' : Unsigned 8 bit, Rate 8000 Hz, Mono
Samples read: 16000
Length (seconds): 2.000000
Scaled by: 2147483647.0
Maximum amplitude: 0.312500
Minimum amplitude: -0.421875
Midline amplitude: -0.054688
Mean norm: 0.046831
Mean amplitude: -0.000044
RMS amplitude: 0.068383
Maximum delta: 0.414063
Minimum delta: 0.000000
Mean delta: 0.021912
RMS delta: 0.036752
Rough frequency: 684
Volume adjustment: 2.370
The RMS amplitude can then be extracted by piping that output through:
grep -e "RMS.*amplitude" | tr -d ' ' | cut -d ':' -f 2
We grep for the line we want, use tr to delete the space characters, then cut the line at the : character and take the second field, which gives us 0.068383 in this example. Note that sox writes the statistics to stderr, so add 2>&1 to the sox command before piping it into grep. As suggested in the comments, RMS is a better measure of energy than maximum amplitude.
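To make the extraction reusable, the pipeline can be wrapped in a small function. This is only a sketch: the name extract_rms is made up for illustration, and it assumes the statistics arrive on standard input in the format shown above.

```shell
#!/usr/bin/env bash
# extract_rms (hypothetical helper): read sox "stat" output on stdin
# and print only the RMS amplitude value.
extract_rms() {
    grep -e "RMS.*amplitude" | tr -d ' ' | cut -d ':' -f 2
}

# Usage (needs sox and arecord; 2>&1 because stat reports on stderr):
#   value=$(sox -t .wav "|arecord -d 2" -n stat 2>&1 | extract_rms)
```

The "RMS delta" line is not matched, because the pattern requires the word amplitude after RMS.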
Finally, since Bash arithmetic only handles integers, you can use bc to compare the floating-point result against a threshold on the command line:
if (( $(echo "$value > $threshold" | bc -l) )) ; # ...
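The comparison can also be put behind a function, so that an empty reading (for example when the recording failed) does not reach bc. The sketch below is an assumption-laden illustration: is_louder is a made-up name, and the awk fallback for systems without bc is an addition that is not part of the approach described above.

```shell
#!/usr/bin/env bash
# is_louder VALUE THRESHOLD (hypothetical helper):
# succeed (exit status 0) when VALUE is greater than THRESHOLD.
is_louder() {
    # Treat a missing reading as "not louder" instead of passing it to bc.
    [ -n "$1" ] || return 1
    if command -v bc >/dev/null 2>&1; then
        # bc prints 1 when the relation holds and 0 otherwise.
        [ "$(echo "$1 > $2" | bc -l)" -eq 1 ]
    else
        # Fallback, not from the answer: let awk compare the two numbers.
        awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'
    fi
}

# Usage:
#   if is_louder "$value" "$threshold"; then echo "too loud"; fi
```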
If you build a loop (see Bash examples) that calls sleep for 1 minute, tests the volume, and then repeats, you can leave it running in the background. The last step is to add it to the init scripts or service files (depending on your OS / distro), such that you do not even have to launch it manually.
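One way such a loop could look is sketched here. The function names, the threshold of 0.05, and the MONITOR switch are all assumptions chosen for illustration; the action taken on loud input is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical threshold; tune it for your microphone and room.
threshold=${THRESHOLD:-0.05}

# check_volume: record two seconds and report when the RMS amplitude
# exceeds the threshold (needs sox, arecord and bc).
check_volume() {
    local value
    value=$(sox -t .wav "|arecord -d 2" -n stat 2>&1 |
        grep -e "RMS.*amplitude" | tr -d ' ' | cut -d ':' -f 2)
    [ -n "$value" ] || return 0   # no reading, e.g. no capture device
    if (( $(echo "$value > $threshold" | bc -l) )); then
        echo "$(date): RMS $value exceeds $threshold"   # placeholder action
    fi
}

# run_every SECONDS COUNT COMMAND...: run COMMAND, sleep, and repeat.
# COUNT limits the number of runs; 0 means loop forever.
run_every() {
    local interval=$1 count=$2 i=0
    shift 2
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        "$@"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Start monitoring only when asked to, so the file can be sourced safely.
if [ "${MONITOR:-0}" = 1 ]; then
    run_every 60 0 check_volume
fi
```

Saved as a script, it could be started in the background with something like MONITOR=1 ./volume-monitor.sh & (the file name is made up), and the same command line is what an init script or service file would invoke.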