Cost of Retrieval

R has the signature MARK/(studnum, course, assignnum, mark)

E is the query
π#1,#4(σ#2=#7(σ#1=#6(σ#4>#5(MARK×90)×100)×PHYS))

select studnum, mark from MARK
where course = 'PHYS' and studnum = 100 and mark > 90
Physical design for MARK
  1. Btree primary index on course
    • MARK-PINDEX/(RID-P, studnum, course, assignnum, mark)
  2. Btree secondary index on studnum
    • MARK-SINDEX/(RID-S, studnum, RID-P)
Statistical metadata
  1. |MARK| = 10000
  2. b(MARK-PINDEX) = 50
  3. distinct(MARK, studnum) = 500 (number of different students)
  4. distinct(MARK, course) = 100 (number of different courses)
  5. distinct(MARK, mark) = 100 (number of different marks)
Strategy 1: Use Primary Index

Query plan
π#2,#5(σ#2=#7(σ#5>#6(σ#3=PHYS(MARKPINDEX)×90)×100))

Selection of N tuples from relation R using a clustered primary key has cost 2+N/b(R)

Searching Btree has cost 2

Whether its for primary or secondary index

Strategy 2: Use Secondary Index

Query Plan:
π#5,#8(σ#6=#10(σ#8>#9((σ#2=100(MARKSINDEX)×σ#1=#3l(MARKPINDEX))×90)×PHYS))

Nested index join

#3l refers to column 3 of the left argument of the nested cross product operator ×

Selection of N tuples from relation R using an unclustered secondary index has a cost of 2+N

Strategy 3: Scan Primary Index

Query Plan:
π#2,#5(σ#2=#8(σ#5>#7(σ#3=#6(MARKPINDEX×PHYS)×90)×100))

Selection of N tuples from relation R by an exhaustive scan of its primary index has a cost of |R|/b(R)