
Fragments of Course Notes...

o Intro
o Scanner
o Top-Down Parser
o Syntax-Directed Code Generator
o Interfacing LL(1) Parsers with Code Generators
o Bottom-Up Parsing
o Interfacing LR(1) with LL(1) and Code Generator
o Lex: A Scanner Generator
o Yacc: A Parser Generator
o Symbol Tables
o Storage Organization
o Run-Time Storage Management
o Code Generation for Basic Blocks
o Program Analysis
o Code Improvement


Intro


Scanner

Example: IF, END, LET, ids, integers, +, -, *, /, (, ), /*...*/


Top-Down Parser


Syntax-Directed Code Generator

                               <program>
                                   1
        |--------------------------|--------------------------|
      <dcl>                      <body>                      END
        2                          4
        |                          |
  |-----|-----|            |-------------|
 DCL  id:x  <dcl>        <stat>        <body>
              3            6              4
              |            |              |
              e         <stat1>     |-----------|
                          13      <stat>      <body>
                           |         6           5
                       |-------|     |           |
                     READ    id:x <stat1>        e
                                    14
                                     |
                                 |-------|
                               WRITE   id:x

Assumptions:


                                       <program>
                                            1
        |-------------------------------------------------------|
      <dcl>                              <body>                 |
        2                                   4                END
        |                                   |
  |-----------|                 |-----------------------|
  |     |   <dcl>             <stat>                  <body>
DCL    <0>    3                 6                      5|
              |                 |                       |
              |                                         |
              e              <stat1>                   e
                                9
                     |------|--------------|
                     |      |    |
                   LET    <0>   =       <expr>
                                          20
                                           |
                                        <expr1>
                                          22
                                           |
                                           |
                                        <term>
                                          24
                                           |

                                        <oprnd>
                                          27
                                     |------------|
                                     |            |
                                    (   <expr>    )
                                          20
                                           |
                                        <expr1>
                                          22
                                           |

                                        <term>
                                          24
                                           |

                                        <oprnd>
                                          25
                                           |
                                          <0>


Interfacing LL(1) Parsers with Code Generators

Old action. Remove the nonterminal from the top of the syntax stack, call `print(i)', and push the right-hand side of rule i onto the stack in reverse order.

Revised action. Remove the nonterminal from the top of the syntax stack, push i onto the stack, and then push the right-hand side of rule i in reverse order. Upon encountering i on top of the stack, call `semantics(i)'.
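
The revised action can be sketched in Python as follows. This is only an illustration under stated assumptions: the two-rule grammar, the `RULES` and `TABLE` dictionaries, and the `parse` function are hypothetical and do not come from the notes. Because the rule number i sits below the reversed right-hand side, `semantics(i)` fires only after the whole right-hand side has been matched.

```python
# Hypothetical toy grammar (not from the notes):
#   rule 1: S -> a S b
#   rule 2: S -> c
RULES = {1: ("S", ["a", "S", "b"]), 2: ("S", ["c"])}
TABLE = {("S", "a"): 1, ("S", "c"): 2}      # LL(1) parse table
NONTERMINALS = {"S"}


def parse(tokens, semantics):
    """LL(1) driver in which rule numbers travel on the syntax stack."""
    tokens = list(tokens) + ["$"]
    pos = 0
    stack = ["$", "S"]                       # top of the stack is the end of the list
    while stack:
        top = stack.pop()
        if isinstance(top, int):             # rule number reached: call the action
            semantics(top)
        elif top in NONTERMINALS:
            rule = TABLE.get((top, tokens[pos]))
            if rule is None:
                raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
            stack.append(rule)                        # push i first ...
            stack.extend(reversed(RULES[rule][1]))    # ... then the reversed RHS
        else:                                # terminal or end marker: match input
            if top != tokens[pos]:
                raise SyntaxError(f"expected {top!r}, found {tokens[pos]!r}")
            pos += 1
    return True


calls = []
parse("aacbb", calls.append)   # calls now holds the rule numbers in bottom-up order
```

With the old action, the same input would report the rules at expansion time, in the order 1, 1, 2; with the revised action the calls arrive as 2, 1, 1, which is the order a code generator needs when the children of a node must be processed first.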

Bottom-Up Parsing


BASIC PRINCIPLES

Input: b ( ( a a ) a ) b

Grammar:

1. S --> bAb
2. A --> (B
3. A --> a
4. B --> Aa)

[Figure: transition diagram over dotted items for the grammar above, running from S --> .bAb to S --> bAb. along the input.]

"State" = nonterminal symbol augmented with info from production rules


LR(0)

Added Rule:

S' --> S

Item sets:

I0:   S' --> .S     S --> .bAb
I1:   S --> b.Ab    A --> .(B     A --> .a
I2:   S --> bA.b
I3:   S --> bAb.
I4:   S' --> S.
I5:   A --> a.
I6:   A --> (.B     B --> .Aa)    A --> .(B    A --> .a
I7:   A --> (B.
I8:   B --> A.a)
I9:   B --> Aa.)
I10:  B --> Aa).

Transitions:

I0:   S: I4    b: I1
I1:   A: I2    a: I5    (: I6
I2:   b: I3
I6:   B: I7    A: I8    a: I5    (: I6
I8:   a: I9
I9:   ): I10


PARSE TABLE

        a      b      (      )      $       A      B      S      S'
I-0            S-1                                        S-4
I-1     S-5           S-6                   S-2
I-2            S-3
I-3     R-3,S  R-3,S  R-3,S  R-3,S  R-3,S
I-4                                 accept
I-5     R-1,A  R-1,A  R-1,A  R-1,A  R-1,A
I-6     S-5           S-6                   S-8    S-7
I-7     R-2,A  R-2,A  R-2,A  R-2,A  R-2,A
I-8     S-9
I-9                          S-10
I-10    R-3,B  R-3,B  R-3,B  R-3,B  R-3,B
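
A table like this can be driven by a short loop. The sketch below is an illustration, not the notes' own driver. It assumes that an entry S-n means "shift and enter state I-n" and that R-k,X means "pop k states and enter the state reached on X", a reading that matches the right-hand-side lengths of the four rules. The names `ACTION`, `REDUCE`, `GOTO`, and `parse` are hypothetical, and the S-n entries under the nonterminal columns are stored separately as `GOTO`.

```python
# Terminal columns of the parse table: state -> {token: next state}.
ACTION = {
    0: {"b": 1},
    1: {"a": 5, "(": 6},
    2: {"b": 3},
    6: {"a": 5, "(": 6},
    8: {"a": 9},
    9: {")": 10},
}
# Reducing states: state -> (number of symbols to pop, left-hand side).
REDUCE = {3: (3, "S"), 5: (1, "A"), 7: (2, "A"), 10: (3, "B")}
# Nonterminal columns of the parse table.
GOTO = {(0, "S"): 4, (1, "A"): 2, (6, "A"): 8, (6, "B"): 7}
ACCEPT_STATE = 4


def parse(tokens):
    """Return the list of reductions performed, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack, pos, trace = [0], 0, []
    while True:
        state = stack[-1]
        if state in REDUCE:                   # LR(0): reduce without lookahead
            count, lhs = REDUCE[state]
            del stack[-count:]
            stack.append(GOTO[(stack[-1], lhs)])
            trace.append(f"R-{count},{lhs}")
        elif state == ACCEPT_STATE and tokens[pos] == "$":
            return trace
        else:
            nxt = ACTION.get(state, {}).get(tokens[pos])
            if nxt is None:
                raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
            stack.append(nxt)                 # shift
            pos += 1


trace = parse("b((aa)a)b")   # the input from BASIC PRINCIPLES, without blanks
```

For the input b ( ( a a ) a ) b the reductions come out as R-1,A, R-3,B, R-2,A, R-3,B, R-2,A, R-3,S, which is the rightmost derivation in reverse.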

Not LR(0)

Grammar:

S --> A|B
A --> aAc|a
B --> aB|C
C --> bCc|b

Item sets:

I0:   S --> .A|.B    A --> .aAc|.a    B --> .aB|.C    C --> .bCc|.b
I1:   A --> a.Ac|a.|.aAc|.a    B --> a.B|.aB|.C    C --> .bCc|.b
I2:   A --> aA.c
I3:   A --> aAc.
I4:   S --> A.
I5:   S --> B.
I6:   B --> C.
I7:   B --> aB.
I8:   C --> b.Cc|b.|.bCc|.b
I9:   C --> bC.c
I10:  C --> bCc.

Transitions:

I0:   A: I4    B: I5    C: I6    a: I1    b: I8
I1:   A: I2    B: I7    C: I6    a: I1    b: I8
I2:   c: I3
I8:   C: I9    b: I8
I9:   c: I10


LR(1)

[Figure: LR(1) item-set automaton for the grammar above, states I0 to I16; each item carries a lookahead of $ or c.]


LALR(1)

LR(1)-oriented, where states with identical cores are merged


SLR(1)

LR(0) + computed lookahead symbols for cores

[Figure: SLR(1) automaton for the same grammar: the LR(0) states I0 to I10, with each item annotated by the FOLLOW set of its left-hand side ($ for S and B, {c,$} for A and C).]


EXAMPLE

Grammar:

E --> E+T|T
T --> T*F|F
F --> (E)|id


SLR(1) but not LR(0)
S' --> S
S --> AS|b
A --> Sa|a

FOLLOW(S') = {$}
FOLLOW(S) = {$, a}
FOLLOW(A) = {a, b}
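
The FOLLOW sets listed for this grammar can be checked with a small fixed-point computation. The sketch below is a simplified illustration: it assumes a grammar without empty productions (true of this one), and the names `GRAMMAR`, `first_sets`, and `follow_sets` are hypothetical.

```python
# Grammar from the notes: S' --> S, S --> AS|b, A --> Sa|a.
GRAMMAR = [
    ("S'", ["S"]),
    ("S", ["A", "S"]),
    ("S", ["b"]),
    ("A", ["S", "a"]),
    ("A", ["a"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST sets, assuming no production has an empty right-hand side."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(start="S'"):
    """FOLLOW sets by iterating to a fixed point (no empty productions)."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):          # something follows sym in this rule
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                         # sym ends the rule: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


follow = follow_sets()
```

A grammar with empty productions would need nullable tracking in both functions; that case is left out here to keep the sketch short.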

[Figure: SLR(1) item-set automaton for the grammar above, states I0 to I7, with FOLLOW-set lookaheads attached to the completed items.]


LALR(1) but not SLR(1)
S --> Aa|bAc|dc|bda
A --> d

FOLLOW(S) = {$}
FOLLOW(A) = {a, c}

Item sets:

I0:   S --> .Aa,$    S --> .bAc,$    S --> .dc,$    S --> .bda,$    A --> .d,a
I1:   S --> A.a,$
I2:   S --> Aa.,$
I3:   S --> b.Ac,$    S --> b.da,$    A --> .d,c
I4:   S --> bA.c,$
I5:   S --> d.c,$    A --> d.,a
I6:   S --> bd.a,$    A --> d.,c
I7:   S --> bAc.,$
I8:   S --> dc.,$
I9:   S --> bda.,$

Transitions:

I0:   A: I1    b: I3    d: I5
I1:   a: I2
I3:   A: I4    d: I6
I4:   c: I7
I5:   c: I8
I6:   a: I9


LR(1) but not LALR(1)
S --> Aa|bAc|Bc|bBa
A --> d
B --> d

Item sets:

I0:   S --> .Aa,$    S --> .bAc,$    S --> .Bc,$    S --> .bBa,$    A --> .d,a    B --> .d,c
I1:   A --> d.,a    B --> d.,c
I2:   S --> B.c,$
I3:   S --> Bc.,$
I4:   S --> A.a,$
I5:   S --> Aa.,$
I6:   S --> b.Ac,$    S --> b.Ba,$    A --> .d,c    B --> .d,a
I7:   S --> bA.c,$
I8:   S --> bAc.,$
I9:   A --> d.,c    B --> d.,a
I10:  S --> bB.a,$
I11:  S --> bBa.,$

Transitions:

I0:   d: I1    B: I2    A: I4    b: I6
I2:   c: I3
I4:   a: I5
I6:   A: I7    d: I9    B: I10
I7:   c: I8
I10:  a: I11

NOTE: In LALR we should merge states I1 and I9 since they have the same core. The merged state calls for two different reductions on each of a and c, so the grammar is not LALR(1).

I1 + I9:   A --> d. , {a,c}
           B --> d. , {a,c}
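
The merge of I1 and I9 can be reproduced in a few lines. The sketch below is an illustration only: the item encoding (left-hand side, right-hand side, dot position, lookahead) and the function names `core`, `merge_by_core`, and `reduce_reduce_conflicts` are hypothetical choices, not part of the notes.

```python
from collections import defaultdict

# LR(1) items as (lhs, rhs, dot position, lookahead).
I1 = {("A", ("d",), 1, "a"), ("B", ("d",), 1, "c")}
I9 = {("A", ("d",), 1, "c"), ("B", ("d",), 1, "a")}


def core(state):
    """The core of a state: its items with the lookaheads dropped."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)


def merge_by_core(states):
    """Union all states that share a core, as in the LALR(1) construction."""
    merged = defaultdict(set)
    for state in states:
        merged[core(state)] |= state
    return list(merged.values())


def reduce_reduce_conflicts(state):
    """Lookaheads on which more than one completed item asks for a reduction."""
    by_lookahead = defaultdict(set)
    for lhs, rhs, dot, lookahead in state:
        if dot == len(rhs):                  # dot at the end: a completed item
            by_lookahead[lookahead].add((lhs, rhs))
    return {la: items for la, items in by_lookahead.items() if len(items) > 1}


merged_states = merge_by_core([I1, I9])
conflicts = reduce_reduce_conflicts(merged_states[0])
```

Neither I1 nor I9 has a conflict on its own; after the merge, both a and c select A --> d and B --> d at once.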


Interfacing LR(1) with LL(1) and Code Generator

Push the start symbol `<program>' onto the PDA stack, and repeatedly act according to the entry on top of the stack.

Nonterminal symbol other than <expr>. Consult the LL(1) parse table, pop the nonterminal, and push `rule number + right-hand side' in reverse order.

Terminal. Match against the input, and put its semantic info on the INFO stack.

Rule number. Call the semantic action.

Empty. Terminate compilation.

<expr>. Replace with the start state of the LR(1) parser.

State. Shift/reduce according to the LR(1) parse table. On reduce, call the semantic action; on shift, push the semantic info of the token onto INFO.

Lex: A Scanner Generator

Input to LEX:

        auxiliary definitions
        %%
        translation rules
        P1  A1
        P2  A2
        . . .
        %%
        user routines

Output of LEX:

        function yyLEX()
          function ScannerActions()
            case i of
              1:
              2:
              3:
              . . .
            end
          end
          table-driven scanner
          table
        end

Action Ai becomes case i of ScannerActions, and the patterns Pi are compiled into the table used by the table-driven scanner.

Auxiliary definitions:

        letter = A | B | ... | Z
        digit  = 0 | 1 | ... | 9

Translation rules:

        BEGIN                   {return 1}
        END                     {return 2}
        letter(letter|digit)*   {yyval := lookid(yytext);  /* value */
                                 return 3                  /* class */  }
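
To show what a scanner built from these three rules would do, here is a hand-written Python sketch. It is not Lex output. It assumes that blanks separate tokens (the rules above do not say so) and that `lookid` returns an index into a symbol table; `IDENT`, `KEYWORDS`, `symbol_table`, and `scan` are hypothetical names. Matching the identifier pattern first and then checking for keywords imitates the usual Lex policy of longest match with ties going to the earlier rule.

```python
import re

IDENT = re.compile(r"[A-Z][A-Z0-9]*")    # letter(letter|digit)*
KEYWORDS = {"BEGIN": 1, "END": 2}         # classes returned by the first two rules
symbol_table = []                         # hypothetical backing store for lookid


def lookid(text):
    """Return the symbol-table index of text, entering it on first use."""
    if text not in symbol_table:
        symbol_table.append(text)
    return symbol_table.index(text)


def scan(source):
    """Yield (class, value) pairs; the value is None for keywords."""
    pos = 0
    while pos < len(source):
        if source[pos].isspace():         # assumption: white space is skipped
            pos += 1
            continue
        match = IDENT.match(source, pos)
        if match is None:
            raise ValueError(f"illegal character {source[pos]!r} at {pos}")
        text = match.group()
        if text in KEYWORDS:
            yield KEYWORDS[text], None
        else:
            yield 3, lookid(text)         # class 3, value from the symbol table
        pos = match.end()
```

For example, `list(scan("BEGIN X1 Y X1 END"))` yields class 1, then three class-3 tokens whose values identify X1, Y, and X1 again, then class 2.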

Yacc: A Parser Generator

Input to YACC:

        auxiliary definitions
        %%
        translation rules
        P1  A1
        P2  A2
        . . .
        %%
        user routines

Output of YACC:

        function yyparse()
          function SemanticActions()
            case i of
              1:
              2:
              3:
              . . .
            end
          end
          LR(1) table-driven parser
          table
        end

Action Ai becomes case i of SemanticActions, and the productions Pi are compiled into the table used by the LR(1) table-driven parser.


Symbol Tables


Storage Organization

Scalars. address = (data area number, offset).
Arrays.
1 dimension. Contiguous memory: address = base + c * i.
2 dimensions.
  • Contiguous memory: address = base + c * row + d * col.
  • Tree structure with a root of pointers to rows and a contiguous area for each row: row(a + i) + c * j + d.
  • Hash functions for sparse arrays.
Dynamic arrays. Use dope vectors.
Sets. Bit patterns.
Strings. Bounded and unbounded length.
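
The contiguous-memory formulas and the idea of a dope vector can be sketched as follows. This is an illustration under assumptions the notes do not state: indices start at 0, two-dimensional arrays are stored row by row, and the `DopeVector` class with its fields is hypothetical.

```python
def addr_1d(base, c, i):
    """address = base + c * i, where c is the element size."""
    return base + c * i


def addr_2d(base, c, d, row, col):
    """address = base + c * row + d * col."""
    return base + c * row + d * col


def row_major_strides(n_cols, elem_size):
    """Assumed row-major layout: c is the size of one row, d of one element."""
    return n_cols * elem_size, elem_size


class DopeVector:
    """Hypothetical run-time descriptor of a dynamic 2-D array.

    The bounds are known only at run time, so the base address and the
    multipliers c and d are stored here instead of being compiled into
    the addressing code.
    """

    def __init__(self, base, n_rows, n_cols, elem_size):
        self.base = base
        self.n_rows = n_rows
        self.n_cols = n_cols
        self.c, self.d = row_major_strides(n_cols, elem_size)

    def address(self, row, col):
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise IndexError("subscript out of range")
        return addr_2d(self.base, self.c, self.d, row, col)


dv = DopeVector(base=1000, n_rows=3, n_cols=4, elem_size=8)
element_address = dv.address(2, 1)   # 1000 + 32 * 2 + 8 * 1
```

Keeping c and d in the descriptor lets the same addressing code serve arrays of any run-time shape, at the cost of extra loads on each access.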

Run-Time Storage Management


Code Generation for Basic Blocks


Program Analysis

Control flow analysis.
  1. Partition into basic blocks. Mark as leaders: the first instruction, each target of a goto, and each successor of a goto.
  2. Determine successor relationships between basic blocks.
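
The two steps above can be sketched in Python. The instruction format is an assumption made for the example: each instruction is a tuple, `("goto", t)` jumps unconditionally to index t, and `("if_goto", cond, t)` jumps to t or falls through. The function names are hypothetical as well.

```python
JUMPS = ("goto", "if_goto")


def find_leaders(code):
    """Step 1: first instruction, each jump target, each successor of a jump."""
    leaders = {0} if code else set()
    for i, instr in enumerate(code):
        if instr[0] in JUMPS:
            leaders.add(instr[-1])            # target of the jump
            if i + 1 < len(code):
                leaders.add(i + 1)            # instruction after the jump
    return sorted(leaders)


def basic_blocks(code):
    """Blocks as half-open index ranges [start, end), one per leader."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code)]
    return [(bounds[k], bounds[k + 1]) for k in range(len(leaders))]


def successors(code):
    """Step 2: successor block numbers for every basic block."""
    blocks = basic_blocks(code)
    block_of = {start: k for k, (start, _) in enumerate(blocks)}
    succ = {}
    for k, (start, end) in enumerate(blocks):
        last = code[end - 1]
        out = []
        if last[0] in JUMPS:
            out.append(block_of[last[-1]])    # edge to the jump target
        if last[0] != "goto" and end < len(code):
            out.append(block_of[end])         # fall-through edge
        succ[k] = out
    return succ


program = [
    ("assign", "i", "0"),            # 0
    ("if_goto", "i >= n", 5),        # 1
    ("assign", "s", "s + i"),        # 2
    ("assign", "i", "i + 1"),        # 3
    ("goto", 1),                     # 4
    ("write", "s"),                  # 5
]
flow = successors(program)
```

For this loop the leaders are instructions 0, 1, 2, and 5, giving four blocks; the edge from the third block back to the second is the back edge of the loop.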

Depth first ordering.

Dominators.

Loop detection.

Live variable analysis.

UD (use-definition) chaining.

Code Improvement
