Sybase ASE - AVG Function Error

Perlbaby · February 22, 2017, 9:45am

Hi Team -
I am using Sybase ASE 15.7 version. Below query is throwing an error
stating Error : incorrect syntax near the keyword 'OVER'

SELECT 
     EMPLOYEE_ID ,
     EMPLOYEE ,
     Department,
     CAST( Salary as DECIMAL( 10, 2 ) )  AS Salary 
  CAST(AVG( Salary) OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) )          AS Average,
     CAST(STDDEV_POP( Salary )   OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) )     AS StandardDeviation
    FROM EMP_Table
    GROUP BY EMPLOYEE_ID,EMPLOYEE, Department, Salary

Please help me understand what is missed here ? Any solution appreciated.

Corona688 · February 22, 2017, 9:49am

What do you mean by "over"? Divided by? Perhaps just a /

Perlbaby · February 23, 2017, 3:30am

Thanks Corona688 for the reply .

The following query was tried to returns a result set that shows the employees whose salary is one standard deviation greater than the average salary of their department.

So the below OVER was used for partitioning based on department value. For some reason , its not working in sybase ASE and throwing error .

  CAST(AVG( Salary) OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) )          AS Average,
     CAST(STDDEV_POP( Salary )   OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) )     AS StandardDeviation

please guide if this can be tried in different way or am i missing anything in above query . Thanks in advance.

durden_tyler · March 2, 2017, 10:33am

perlbaby:

...
The following query was tried to returns a result set that shows the employees whose salary is one standard deviation greater than the average salary of their department.

So the below OVER was used for partitioning based on department value. For some reason , its not working in sybase ASE and throwing error .
  CAST(AVG( Salary) OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) )          AS Average,
   CAST(STDDEV_POP( Salary )   OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) )     AS StandardDeviation
please guide if this can be tried in different way or am i missing anything in above query . ....

The syntax consisting of "over ... partition by ..." etc. belongs to a special class of database functions called "window functions" or "analytic functions".
These were defined in the ANSI/ISO SQL 2003 version and implemented by a few databases like Oracle, SQL Server, PostgreSQL, DB2 etc.
Sybase ASE 15.7 is not one of those databases; it does not support window functions, hence the error message.
My guess is that you got that syntax from a place dedicated to a database other than Sybase ASE 15.7

Of course, it can be done without window functions but it will be inefficient.
The first thing you need to keep in mind is that each aggregate function has its own window function counterpart.
So:
=> the "avg" aggregate function returns the average for each distinct set of group-by columns for all the rows satisfied by the filter criteria. The number of rows returned is typically less than that satisfied by the filter criteria.
=> the "avg" window function returns the average for the set of partition-by columns for all the rows satisfied by the filter criteria. The number of rows returned is always equal to that satisfied by the filter criteria.

Here's an example using Oracle 11g; I don't have Sybase ASE at the moment.

SQL> -- My data looks like the following
SQL> select * from emp;
  
     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 09-DEC-82       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10
  
 14 rows selected.
  
SQL>
SQL> -- The "AVG" function used as an aggregate function. Notice that only 3 rows are returned,
SQL> -- although I did not use any filter criteria i.e. no "WHERE" and "AND" clauses.
SQL> select deptno, avg(sal) as avg_sal_agg from emp group by deptno order by deptno;
  
    DEPTNO AVG_SAL_AGG
---------- -----------
        10  2916.66667
        20        2175
        30  1566.66667
  
 3 rows selected.
  
SQL>
SQL> -- The "AVG" function used as a window function. Notice that all 14 rows are returned, and the
SQL> -- average value is the same that you'd expect for each department (as per the query above). The 14 rows
SQL> -- were returned because I did not use any filter criteria. If I had used one e.g. "where deptno = 10", then
SQL> -- only the first 3 rows would've been returned.
SQL> select empno, ename, sal, deptno, avg(sal) over (partition by deptno) as avg_sal_wndw from emp order by deptno;
  
     EMPNO ENAME             SAL     DEPTNO AVG_SAL_WNDW
---------- ---------- ---------- ---------- ------------
      7782 CLARK            2450         10   2916.66667
      7839 KING             5000         10   2916.66667
      7934 MILLER           1300         10   2916.66667
      7566 JONES            2975         20         2175
      7902 FORD             3000         20         2175
      7876 ADAMS            1100         20         2175
      7369 SMITH             800         20         2175
      7788 SCOTT            3000         20         2175
      7521 WARD             1250         30   1566.66667
      7844 TURNER           1500         30   1566.66667
      7499 ALLEN            1600         30   1566.66667
      7900 JAMES             950         30   1566.66667
      7698 BLAKE            2850         30   1566.66667
      7654 MARTIN           1250         30   1566.66667
  
 14 rows selected.
  
SQL>
SQL>

If the difference mentioned above is clear, then the alternative query should be clear as well.
You simply find out the average and std dev for each department id in a subquery and join it to the emp table on the department id.
That is shown below:

SQL>
SQL> select e.empno, e.ename, e.deptno, e.sal,
  2         x.avg_sal, x.std_dev_sal
  3    from emp e,
  4         (  select deptno,
  5                   avg(sal) as avg_sal,
  6                   stddev_pop(sal) as std_dev_sal
  7              from emp
  8             group by deptno
  9         ) x
 10   where e.deptno = x.deptno
 11   order by deptno, empno
 12  ;
  
     EMPNO ENAME          DEPTNO        SAL    AVG_SAL STD_DEV_SAL
---------- ---------- ---------- ---------- ---------- -----------
      7782 CLARK              10       2450 2916.66667  1546.14215
      7839 KING               10       5000 2916.66667  1546.14215
      7934 MILLER             10       1300 2916.66667  1546.14215
      7369 SMITH              20        800       2175  1004.73877
      7566 JONES              20       2975       2175  1004.73877
      7788 SCOTT              20       3000       2175  1004.73877
      7876 ADAMS              20       1100       2175  1004.73877
      7902 FORD               20       3000       2175  1004.73877
      7499 ALLEN              30       1600 1566.66667  610.100174
      7521 WARD               30       1250 1566.66667  610.100174
      7654 MARTIN             30       1250 1566.66667  610.100174
      7698 BLAKE              30       2850 1566.66667  610.100174
      7844 TURNER             30       1500 1566.66667  610.100174
      7900 JAMES              30        950 1566.66667  610.100174
  
 14 rows selected.
  
SQL>
SQL>

Window functions were introduced to remove the inefficiency in queries like the one above.
You are scanning the table once to determine the average and std dev per department in subquery x.
And then you are scanning the same table (emp) again in order to join the data in subquery x back to the table emp (with alias "e").
Imagine the amount of repetitive work if there are a huge number of subqueries like "x" above.

The same output using window functions is below.

SQL>
SQL> select empno, ename, deptno, sal,
  2         avg(sal) over (partition by deptno) as avg_sal,
  3         stddev_pop(sal) over (partition by deptno) as std_dev_sal
  4    from emp
  5   group by empno,ename,deptno,sal
  6   order by deptno, empno
  7  ;
  
     EMPNO ENAME          DEPTNO        SAL    AVG_SAL STD_DEV_SAL
---------- ---------- ---------- ---------- ---------- -----------
      7782 CLARK              10       2450 2916.66667  1546.14215
      7839 KING               10       5000 2916.66667  1546.14215
      7934 MILLER             10       1300 2916.66667  1546.14215
      7369 SMITH              20        800       2175  1004.73877
      7566 JONES              20       2975       2175  1004.73877
      7788 SCOTT              20       3000       2175  1004.73877
      7876 ADAMS              20       1100       2175  1004.73877
      7902 FORD               20       3000       2175  1004.73877
      7499 ALLEN              30       1600 1566.66667  610.100174
      7521 WARD               30       1250 1566.66667  610.100174
      7654 MARTIN             30       1250 1566.66667  610.100174
      7698 BLAKE              30       2850 1566.66667  610.100174
      7844 TURNER             30       1500 1566.66667  610.100174
      7900 JAMES              30        950 1566.66667  610.100174
  
 14 rows selected.
  
SQL>
SQL>

Here, the table emp is scanned only once and the calculated average and std dev are provided for all the rows returned.

Hope that helps.
Converting the syntax back to Sybase ASE should not be a big problem - Oracle does not require the "cast" function and my columns were different than yours.
I leave it to you as an exercise.