I’ll base this question on a toy example.
Let this be table A
:
A

U  V  W  X  Y  Z

a  b  c  1  6  8.3
a  b  c  1  4  3.7
a  b  f  3  4  2.6
a  b  f  3  2  6.0
a  e  c  1  0  3.5
a  e  c  1  5  8.8
d  b  f  1  0  3.5
d  b  f  1  3  2.3
d  e  c  2  6  2.2
d  e  c  2  4  3.3
d  e  f  0  7  5.0
d  e  f  0  6  3.6
I can produce a second table B
by grouping the rows of A
by columns U
, V
, and W
, and computing the average of column Z for each group.
B

U  V  W  Z_avg

a  b  c  6.0
a  b  f  4.3
a  e  c  6.2
d  b  f  2.9
d  e  c  2.7
d  e  f  4.3
The SQL for this would be something like
SELECT U, V, W, AVG(Z) AS Z_avg FROM A GROUP BY U, V, W;
But I want the new table to include all the columns of the original table that have a functional dependence on the grouping columns U
, V
, and W
. In this example there is one such column, namely column X
.
In other words, I want to generate the table C
shown below:
C

U  V  W  X  Z_avg

a  b  c  1  6.0
a  b  f  3  4.3
a  e  c  1  6.2
d  b  f  1  2.9
d  e  c  2  2.7
d  e  f  0  4.3
So this problem has two parts, at least conceptually.

How to determine which columns are functionally dependent on
columns U
, V
, and W
?

What is the SQL to generate table C
?
I know how to implement a (say, Python) script that can answer (1), but it is tedious and slow. (Basically, for each of the candidate columns, in this case X
and Y
, the script would collect all of its values for each distinct combination of values in columns U
, V
, and Z
, and then, if each of these sets of values has exactly one element, then the column is functionally related to U
, V
, and Z
.)
Likewise, once I have identfied the functionally dependent columns, I can muddle may way through (using temporary tables and what not) to eventually end up with something like table C
above (thus, effectively solving (2)).
I figure, however, that this task is sufficiently common that there may be standard tools/techniques to carry it out.