| 
                         当两张表进行联结的时候,如果表1中的数据行是否出现在结果集中需要根据表2中出现或不出现至少一个相匹配的数据行来判断,这种情况就会发生半联结;而反联结便是半联结的补集,它们会作为数据库中常见的联结方法如NESTED LOOPS,MERGE SORT JOIN,HASH JOIN的选项出现。 
实际上半联结和反联结本身也可以被认同是两种联结方法;在CBO优化模式下,优化器能够根据实际情况灵活的转换执行语句从而实现半联结和反联结方法,毕竟没有什么SQL语法可以显式的调用半联结和反联结,它们只是SQL语句满足某些条件时优化器可以选择的选项而已,不过仍然有必要深入这两种选项在特定情况下带来的性能优势。 
半联结
半联结通常都发生在使用含有IN和EXISTS的相关子查询的时候,=ANY的用法与IN相同,所以也会出现发生半联结的情况;不过也是有例外的,在11gR2版本中,优化器不会为任何包含在OR分支中的子查询选择半联结,这也是现在官档中唯一明确标识的限制条件,来看几种发生场景: 
 代码如下:
-- 使用IN关键字的相关子查询 => 发生NESTED LOOPS半联结 
SQL> select department_name 
 2 from hr.departments dept 
 3 where department_id in (select department_id from hr.employees emp); 
11 rows selected. 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 2605691773 
---------------------------------------------------------------------------------------- 
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 
---------------------------------------------------------------------------------------- 
| 0 | SELECT STATEMENT | | 10 | 190 | 3 (0)| 00:00:01 | 
| 1 | NESTED LOOPS SEMI | | 10 | 190 | 3 (0)| 00:00:01 | 
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 | 
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 | 
---------------------------------------------------------------------------------------- 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 3 - access("DEPARTMENT_ID"="DEPARTMENT_ID") 
Statistics 
---------------------------------------------------------- 
 0 recursive calls 
 0 db block gets 
 11 consistent gets 
 0 physical reads 
 0 redo size 
 742 bytes sent via SQL*Net to client 
 524 bytes received via SQL*Net from client 
 2 SQL*Net roundtrips to/from client 
 0 sorts (memory) 
 0 sorts (disk) 
 11 rows processed 
-- 使用EXISTS关键字的相关子查询 => 发生NESTED LOOPS半联结 
SQL> select department_name 
 2 from hr.departments dept where exists 
 3 (select null from hr.employees emp where emp.department_id = dept.department_id); 
11 rows selected. 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 2605691773 
---------------------------------------------------------------------------------------- 
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 
---------------------------------------------------------------------------------------- 
| 0 | SELECT STATEMENT | | 10 | 190 | 3 (0)| 00:00:01 | 
| 1 | NESTED LOOPS SEMI | | 10 | 190 | 3 (0)| 00:00:01 | 
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 | 
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 | 
---------------------------------------------------------------------------------------- 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 3 - access("EMP"."DEPARTMENT_ID"="DEPT"."DEPARTMENT_ID") 
Statistics 
---------------------------------------------------------- 
 1 recursive calls 
 0 db block gets 
 11 consistent gets 
 0 physical reads 
 0 redo size 
 742 bytes sent via SQL*Net to client 
 524 bytes received via SQL*Net from client 
 2 SQL*Net roundtrips to/from client 
 0 sorts (memory) 
 0 sorts (disk) 
 11 rows processed 
-- 谓语中使用了OR分支中的EXISTS子查询 => 禁用半联结 
SQL> select department_name 
 2 from hr.departments dept 
 3 where 1=2 OR exists 
 4 (select null from hr.employees emp where emp.department_id = dept.department_id); 
11 rows selected. 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 440241596 
---------------------------------------------------------------------------------------- 
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 
---------------------------------------------------------------------------------------- 
| 0 | SELECT STATEMENT | | 27 | 432 | 4 (0)| 00:00:01 | 
|* 1 | FILTER | | | | | | 
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 | 
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 2 | 6 | 1 (0)| 00:00:01 | 
---------------------------------------------------------------------------------------- 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 1 - filter( EXISTS (SELECT 0 FROM "HR"."EMPLOYEES" "EMP" WHERE 
 "EMP"."DEPARTMENT_ID"=:B1)) 
 3 - access("EMP"."DEPARTMENT_ID"=:B1) 
Statistics 
---------------------------------------------------------- 
 1 recursive calls 
 0 db block gets 
 35 consistent gets 
 0 physical reads 
 0 redo size 
 742 bytes sent via SQL*Net to client 
 524 bytes received via SQL*Net from client 
 2 SQL*Net roundtrips to/from client 
 0 sorts (memory) 
 0 sorts (disk) 
 11 rows processed 
 
从结果集来看,我们很容易联想到内联结,那为什么半联结通常来说会获得更高的性能呢?这实际也是半联结优化的关键,拿NESTED LOOPS来举例,在NESTED LOOPS联结中,驱动表被读取后需要逐个的进入内层循环来进行匹配工作,并且只有当外层循环的数据行与内层循环中的每一行数据匹配运算完成后才会结束一个结果集的获取;而相对而言,半联结的区别在于数据集1中的每一条记录只返回一次,而不管数据集2中有几条匹配的记录,因此,半联结会在找到子查询中匹配到的第一条数据后立即结束处理从而提高性能。 
对于某些需要利用半联结来提高性能的场景,可以通过手动的方式控制半联结的执行计划,使用SEMIJOIN和NO_SEMIJOIN提示分别可以指定优化器使用和禁用半联结。 
 代码如下:
-- 使用NO_SEMIJOIN提示禁用半联结 
SQL> select department_name 
 2 from hr.departments dept 
 3 where department_id in (select /*+ no_semijoin */department_id from hr.employees emp); 
11 rows selected. 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 3372191744 
------------------------------------------------------------------------------------------ 
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 
------------------------------------------------------------------------------------------ 
| 0 | SELECT STATEMENT | | 106 | 1802 | 4 (25)| 00:00:01 | 
| 1 | VIEW | VM_NWVW_2 | 106 | 1802 | 4 (25)| 00:00:01 | 
| 2 | HASH UNIQUE | | 106 | 2544 | 4 (25)| 00:00:01 | 
| 3 | NESTED LOOPS | | 106 | 2544 | 3 (0)| 00:00:01 | 
| 4 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 567 | 3 (0)| 00:00:01 | 
|* 5 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 4 | 12 | 0 (0)| 00:00:01 | 
------------------------------------------------------------------------------------------ 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 5 - access("DEPARTMENT_ID"="DEPARTMENT_ID") 
Statistics 
---------------------------------------------------------- 
 506 recursive calls 
 0 db block gets 
 188 consistent gets 
 7 physical reads 
 0 redo size 
 742 bytes sent via SQL*Net to client 
 524 bytes received via SQL*Net from client 
 2 SQL*Net roundtrips to/from client 
 10 sorts (memory) 
 0 sorts (disk) 
 11 rows processed 
 
除此之外,我们还可以使用_always_semi_join隐藏参数选择半联结的联结类型,有关_always_semi_join参数的可选值: 
 代码如下:
SQL> SELECT 
 2 PARNO_KSPVLD_VALUES pvalid_par#, 
 3 NAME_KSPVLD_VALUES pvalid_name, 
 4 VALUE_KSPVLD_VALUES pvalid_value, 
 5 DECODE(ISDEFAULT_KSPVLD_VALUES,'FALSE','','DEFAULT' ) pvalid_default 
 6 FROM 
 7 X$KSPVLD_VALUES 
 8 WHERE 
 9 LOWER(NAME_KSPVLD_VALUES) LIKE '%'||LOWER(nvl('&pname',name_kspvld_values))||'%' 
10 ORDER BY 
11 pvalid_par#, 
12 pvalid_default, 
13 pvalid_Value 
14 / 
                        (编辑:莱芜站长网) 
【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! 
                     |