Hi Team -
I am using Sybase ASE 15.7 version. Below query is throwing an error
stating Error : incorrect syntax near the keyword 'OVER'
SELECT
EMPLOYEE_ID ,
EMPLOYEE ,
Department,
CAST( Salary as DECIMAL( 10, 2 ) ) AS Salary
CAST(AVG( Salary) OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) ) AS Average,
CAST(STDDEV_POP( Salary ) OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) ) AS StandardDeviation
FROM EMP_Table
GROUP BY EMPLOYEE_ID,EMPLOYEE, Department, Salary
Please help me understand what is missed here ? Any solution appreciated.
The following query was tried to returns a result set that shows the employees whose salary is one standard deviation greater than the average salary of their department.
So the below OVER was used for partitioning based on department value. For some reason , its not working in sybase ASE and throwing error .
CAST(AVG( Salary) OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) ) AS Average,
CAST(STDDEV_POP( Salary ) OVER ( PARTITION BY Department ) AS DECIMAL ( 10, 2 ) ) AS StandardDeviation
please guide if this can be tried in different way or am i missing anything in above query . Thanks in advance.
The syntax consisting of "over ... partition by ..." etc. belongs to a special class of database functions called "window functions" or "analytic functions".
These were defined in the ANSI/ISO SQL 2003 version and implemented by a few databases like Oracle, SQL Server, PostgreSQL, DB2 etc.
Sybase ASE 15.7 is not one of those databases; it does not support window functions, hence the error message.
My guess is that you got that syntax from a place dedicated to a database other than Sybase ASE 15.7
Of course, it can be done without window functions but it will be inefficient.
The first thing you need to keep in mind is that each aggregate function has its own window function counterpart.
So:
=> the "avg" aggregate function returns the average for each distinct set of group-by columns for all the rows satisfied by the filter criteria. The number of rows returned is typically less than that satisfied by the filter criteria.
=> the "avg" window function returns the average for the set of partition-by columns for all the rows satisfied by the filter criteria. The number of rows returned is always equal to that satisfied by the filter criteria.
Here's an example using Oracle 11g; I don't have Sybase ASE at the moment.
SQL> -- My data looks like the following
SQL> select * from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7788 SCOTT ANALYST 7566 09-DEC-82 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 12-JAN-83 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
14 rows selected.
SQL>
SQL> -- The "AVG" function used as an aggregate function. Notice that only 3 rows are returned,
SQL> -- although I did not use any filter criteria i.e. no "WHERE" and "AND" clauses.
SQL> select deptno, avg(sal) as avg_sal_agg from emp group by deptno order by deptno;
DEPTNO AVG_SAL_AGG
---------- -----------
10 2916.66667
20 2175
30 1566.66667
3 rows selected.
SQL>
SQL> -- The "AVG" function used as a window function. Notice that all 14 rows are returned, and the
SQL> -- average value is the same that you'd expect for each department (as per the query above). The 14 rows
SQL> -- were returned because I did not use any filter criteria. If I had used one e.g. "where deptno = 10", then
SQL> -- only the first 3 rows would've been returned.
SQL> select empno, ename, sal, deptno, avg(sal) over (partition by deptno) as avg_sal_wndw from emp order by deptno;
EMPNO ENAME SAL DEPTNO AVG_SAL_WNDW
---------- ---------- ---------- ---------- ------------
7782 CLARK 2450 10 2916.66667
7839 KING 5000 10 2916.66667
7934 MILLER 1300 10 2916.66667
7566 JONES 2975 20 2175
7902 FORD 3000 20 2175
7876 ADAMS 1100 20 2175
7369 SMITH 800 20 2175
7788 SCOTT 3000 20 2175
7521 WARD 1250 30 1566.66667
7844 TURNER 1500 30 1566.66667
7499 ALLEN 1600 30 1566.66667
7900 JAMES 950 30 1566.66667
7698 BLAKE 2850 30 1566.66667
7654 MARTIN 1250 30 1566.66667
14 rows selected.
SQL>
SQL>
If the difference mentioned above is clear, then the alternative query should be clear as well.
You simply find out the average and std dev for each department id in a subquery and join it to the emp table on the department id.
That is shown below:
SQL>
SQL> select e.empno, e.ename, e.deptno, e.sal,
2 x.avg_sal, x.std_dev_sal
3 from emp e,
4 ( select deptno,
5 avg(sal) as avg_sal,
6 stddev_pop(sal) as std_dev_sal
7 from emp
8 group by deptno
9 ) x
10 where e.deptno = x.deptno
11 order by deptno, empno
12 ;
EMPNO ENAME DEPTNO SAL AVG_SAL STD_DEV_SAL
---------- ---------- ---------- ---------- ---------- -----------
7782 CLARK 10 2450 2916.66667 1546.14215
7839 KING 10 5000 2916.66667 1546.14215
7934 MILLER 10 1300 2916.66667 1546.14215
7369 SMITH 20 800 2175 1004.73877
7566 JONES 20 2975 2175 1004.73877
7788 SCOTT 20 3000 2175 1004.73877
7876 ADAMS 20 1100 2175 1004.73877
7902 FORD 20 3000 2175 1004.73877
7499 ALLEN 30 1600 1566.66667 610.100174
7521 WARD 30 1250 1566.66667 610.100174
7654 MARTIN 30 1250 1566.66667 610.100174
7698 BLAKE 30 2850 1566.66667 610.100174
7844 TURNER 30 1500 1566.66667 610.100174
7900 JAMES 30 950 1566.66667 610.100174
14 rows selected.
SQL>
SQL>
Window functions were introduced to remove the inefficiency in queries like the one above.
You are scanning the table once to determine the average and std dev per department in subquery x.
And then you are scanning the same table (emp) again in order to join the data in subquery x back to the table emp (with alias "e").
Imagine the amount of repetitive work if there are a huge number of subqueries like "x" above.
The same output using window functions is below.
SQL>
SQL> select empno, ename, deptno, sal,
2 avg(sal) over (partition by deptno) as avg_sal,
3 stddev_pop(sal) over (partition by deptno) as std_dev_sal
4 from emp
5 group by empno,ename,deptno,sal
6 order by deptno, empno
7 ;
EMPNO ENAME DEPTNO SAL AVG_SAL STD_DEV_SAL
---------- ---------- ---------- ---------- ---------- -----------
7782 CLARK 10 2450 2916.66667 1546.14215
7839 KING 10 5000 2916.66667 1546.14215
7934 MILLER 10 1300 2916.66667 1546.14215
7369 SMITH 20 800 2175 1004.73877
7566 JONES 20 2975 2175 1004.73877
7788 SCOTT 20 3000 2175 1004.73877
7876 ADAMS 20 1100 2175 1004.73877
7902 FORD 20 3000 2175 1004.73877
7499 ALLEN 30 1600 1566.66667 610.100174
7521 WARD 30 1250 1566.66667 610.100174
7654 MARTIN 30 1250 1566.66667 610.100174
7698 BLAKE 30 2850 1566.66667 610.100174
7844 TURNER 30 1500 1566.66667 610.100174
7900 JAMES 30 950 1566.66667 610.100174
14 rows selected.
SQL>
SQL>
Here, the table emp is scanned only once and the calculated average and std dev are provided for all the rows returned.
Hope that helps.
Converting the syntax back to Sybase ASE should not be a big problem - Oracle does not require the "cast" function and my columns were different than yours.
I leave it to you as an exercise.