summaryrefslogtreecommitdiffstats
path: root/Objects/lnotab_notes.txt
blob: d247eddd6f0ed5e6dd0b73e392de9d6a20e9a762 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
All about co_lnotab, the line number table.

Code objects store a field named co_lnotab.  This is an array of unsigned bytes
disguised as a Python string.  It is used to map bytecode offsets to source code
line #s for tracebacks and to identify line number boundaries for line tracing.

The array is conceptually a compressed list of
    (bytecode offset increment, line number increment)
pairs.  The details are important and delicate, best illustrated by example:

    byte code offset    source code line number
        0		    1
        6		    2
       50		    7
      350                 307
      361                 308

Instead of storing these numbers literally, we compress the list by storing only
the increments from one row to the next.  Conceptually, the stored list might
look like:

    0, 1,  6, 1,  44, 5,  300, 300,  11, 1

The above doesn't really work, but it's a start. Note that an unsigned byte
can't hold negative values, or values larger than 255, and the above example
contains two such values. So we make two tweaks:

 (a) there's a deep assumption that byte code offsets and their corresponding
 line #s both increase monotonically, and
 (b) if at least one column jumps by more than 255 from one row to the next,
 more than one pair is written to the table. In case #b, there's no way to know
 from looking at the table later how many were written.  That's the delicate
 part.  A user of co_lnotab desiring to find the source line number
 corresponding to a bytecode address A should do something like this

    lineno = addr = 0
    for addr_incr, line_incr in co_lnotab:
        addr += addr_incr
        if addr > A:
            return lineno
        lineno += line_incr

(In C, this is implemented by PyCode_Addr2Line().)  In order for this to work,
when the addr field increments by more than 255, the line # increment in each
pair generated must be 0 until the remaining addr increment is < 256.  So, in
the example above, assemble_lnotab in compile.c should not (as was actually done
until 2.2) expand 300, 300 to
    255, 255, 45, 45,
but to
    255, 0, 45, 255, 0, 45.

The above is sufficient to reconstruct line numbers for tracebacks, but not for
line tracing.  Tracing is handled by PyCode_CheckLineNumber() in codeobject.c
and maybe_call_line_trace() in ceval.c.

*** Tracing ***

To a first approximation, we want to call the tracing function when the line
number of the current instruction changes.  Re-computing the current line for
every instruction is a little slow, though, so each time we compute the line
number we save the bytecode indices where it's valid:

     *instr_lb <= frame->f_lasti < *instr_ub

is true so long as execution does not change lines.  That is, *instr_lb holds
the first bytecode index of the current line, and *instr_ub holds the first
bytecode index of the next line.  As long as the above expression is true,
maybe_call_line_trace() does not need to call PyCode_CheckLineNumber().  Note
that the same line may appear multiple times in the lnotab, either because the
bytecode jumped more than 255 indices between line number changes or because
the compiler inserted the same line twice.  Even in that case, *instr_ub holds
the first index of the next line.

However, we don't *always* want to call the line trace function when the above
test fails.

Consider this code:

1: def f(a):
2:    while a:
3:       print 1,
4:       break
5:    else:
6:       print 2,

which compiles to this:

  2           0 SETUP_LOOP              19 (to 22)
        >>    3 LOAD_FAST                0 (a)
              6 POP_JUMP_IF_FALSE       17

  3           9 LOAD_CONST               1 (1)
             12 PRINT_ITEM          

  4          13 BREAK_LOOP          
             14 JUMP_ABSOLUTE            3
        >>   17 POP_BLOCK           

  6          18 LOAD_CONST               2 (2)
             21 PRINT_ITEM          
        >>   22 LOAD_CONST               0 (None)
             25 RETURN_VALUE        

If 'a' is false, execution will jump to the POP_BLOCK instruction at offset 17
and the co_lnotab will claim that execution has moved to line 4, which is wrong.
In this case, we could instead associate the POP_BLOCK with line 5, but that
would break jumps around loops without else clauses.

We fix this by only calling the line trace function for a forward jump if the
co_lnotab indicates we have jumped to the *start* of a line, i.e. if the current
instruction offset matches the offset given for the start of a line by the
co_lnotab.  For backward jumps, however, we always call the line trace function,
which lets a debugger stop on every evaluation of a loop guard (which usually
won't be the first opcode in a line).

Why do we set f_lineno when tracing, and only just before calling the trace
function?  Well, consider the code above when 'a' is true.  If stepping through
this with 'n' in pdb, you would stop at line 1 with a "call" type event, then
line events on lines 2, 3, and 4, then a "return" type event -- but because the
code for the return actually falls in the range of the "line 6" opcodes, you
would be shown line 6 during this event.  This is a change from the behaviour in
2.2 and before, and I've found it confusing in practice.  By setting and using
f_lineno when tracing, one can report a line number different from that
suggested by f_lasti on this one occasion where it's desirable.
Tcl is a high-level, general-purpose, interpreted, dynamic programming language. It was designed with the goal of being very simple but powerful.
summaryrefslogtreecommitdiffstats
path: root/generic/tclClock.c
blob: 67570dfb7bfd4d08af3ec4e43ab074a0268faadd (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
/* 
 * tclClock.c --
 *
 *	Contains the time and date related commands.  This code
 *	is derived from the time and date facilities of TclX,
 *	by Mark Diekhans and Karl Lehenbauer.
 *
 * Copyright 1991-1995 Karl Lehenbauer and Mark Diekhans.
 * Copyright (c) 1995 Sun Microsystems, Inc.
 *
 * See the file "license.terms" for information on usage and redistribution
 * of this file, and for a DISCLAIMER OF ALL WARRANTIES.
 *
 * RCS: @(#) $Id: tclClock.c,v 1.28 2004/05/14 21:43:28 kennykb Exp $
 */

#include "tclInt.h"

/*
 * The date parsing stuff uses lexx and has tons o statics.
 */

TCL_DECLARE_MUTEX(clockMutex)

/*
 * Function prototypes for local procedures in this file:
 */

static int		FormatClock _ANSI_ARGS_((Tcl_Interp *interp,
			    unsigned long clockVal, int useGMT,
			    char *format));

/*
 *-------------------------------------------------------------------------
 *
 * Tcl_ClockObjCmd --
 *
 *	This procedure is invoked to process the "clock" Tcl command.
 *	See the user documentation for details on what it does.
 *
 * Results:
 *	A standard Tcl result.
 *
 * Side effects:
 *	See the user documentation.
 *
 *-------------------------------------------------------------------------
 */

int
Tcl_ClockObjCmd (client, interp, objc, objv)
    ClientData client;			/* Not used. */
    Tcl_Interp *interp;			/* Current interpreter. */
    int objc;				/* Number of arguments. */
    Tcl_Obj *CONST objv[];		/* Argument values. */
{
    Tcl_Obj *resultPtr;
    int index;
    Tcl_Obj *CONST *objPtr;
    int useGMT = 0;
    char *format = "%a %b %d %X %Z %Y";
    int clickType = 2;
    int dummy;
    unsigned long baseClock, clockVal;
    long zone;
    Tcl_Obj *baseObjPtr = NULL;
    char *scanStr;
    Tcl_Time now;		/* Current time */

    static CONST char *switches[] = {
	"clicks", "format", "scan", "seconds", (char *) NULL
    };
    enum command {
	COMMAND_CLICKS, COMMAND_FORMAT, COMMAND_SCAN, COMMAND_SECONDS
    };
    static CONST char *clicksSwitches[] = {
	"-milliseconds", "-microseconds", (char*) NULL
    };
    static CONST char *formatSwitches[] = {
	"-format", "-gmt", (char *) NULL
    };
    static CONST char *scanSwitches[] = {
	"-base", "-gmt", (char *) NULL
    };

    resultPtr = Tcl_GetObjResult(interp);
    if (objc < 2) {
	Tcl_WrongNumArgs(interp, 1, objv, "option ?arg ...?");
	return TCL_ERROR;
    }

    if (Tcl_GetIndexFromObj(interp, objv[1], switches, "option", 0, &index)
	    != TCL_OK) {
	return TCL_ERROR;
    }
    switch ((enum command) index) {
	case COMMAND_CLICKS:	{		/* clicks */
	    if (objc == 3) {
		if (Tcl_GetIndexFromObj(interp, objv[2], clicksSwitches,
			"option", 0, &clickType) != TCL_OK) {
		    return TCL_ERROR;
		}
	    } else if (objc != 2) {
		Tcl_WrongNumArgs(interp, 2, objv, "?option?");
		return TCL_ERROR;
	    }
	    switch (clickType) {
	    case 0: 		/* milliseconds */
		Tcl_GetTime(&now);
		Tcl_SetWideIntObj(resultPtr,
			((Tcl_WideInt) now.sec * 1000 + now.usec / 1000));
		break;
	    case 1:		/* microseconds */
		Tcl_GetTime(&now);
		Tcl_SetWideIntObj(resultPtr,
			((Tcl_WideInt) now.sec * 1000000 + now.usec));
		break;
	    case 2:		/* native clicks */
		Tcl_SetWideIntObj(resultPtr, (Tcl_WideInt) TclpGetClicks());
		break;
	    }

	    return TCL_OK;
	}

	case COMMAND_FORMAT:			/* format */
	    if ((objc < 3) || (objc > 7)) {
		wrongFmtArgs:
		Tcl_WrongNumArgs(interp, 2, objv,
			"clockval ?-format string? ?-gmt boolean?");
		return TCL_ERROR;
	    }

	    if (Tcl_GetLongFromObj(interp, objv[2], (long*) &clockVal)
		    != TCL_OK) {
		return TCL_ERROR;
	    }

	    objPtr = objv+3;
	    objc -= 3;
	    while (objc > 1) {
		if (Tcl_GetIndexFromObj(interp, objPtr[0], formatSwitches,
			"switch", 0, &index) != TCL_OK) {
		    return TCL_ERROR;
		}
		switch (index) {
		    case 0:		/* -format */
			format = Tcl_GetStringFromObj(objPtr[1], &dummy);
			break;
		    case 1:		/* -gmt */
			if (Tcl_GetBooleanFromObj(interp, objPtr[1],
				&useGMT) != TCL_OK) {
			    return TCL_ERROR;
			}
			break;
		}
		objPtr += 2;
		objc -= 2;
	    }
	    if (objc != 0) {
		goto wrongFmtArgs;
	    }
	    return FormatClock(interp, (unsigned long) clockVal, useGMT,
		    format);

	case COMMAND_SCAN:			/* scan */
	    if ((objc < 3) || (objc > 7)) {
		wrongScanArgs:
		Tcl_WrongNumArgs(interp, 2, objv,
			"dateString ?-base clockValue? ?-gmt boolean?");
		return TCL_ERROR;
	    }

	    objPtr = objv+3;
	    objc -= 3;
	    while (objc > 1) {
		if (Tcl_GetIndexFromObj(interp, objPtr[0], scanSwitches,
			"switch", 0, &index) != TCL_OK) {
		    return TCL_ERROR;
		}
		switch (index) {
		    case 0:		/* -base */
			baseObjPtr = objPtr[1];
			break;
		    case 1:		/* -gmt */
			if (Tcl_GetBooleanFromObj(interp, objPtr[1],
				&useGMT) != TCL_OK) {
			    return TCL_ERROR;
			}
			break;
		}
		objPtr += 2;
		objc -= 2;
	    }
	    if (objc != 0) {
		goto wrongScanArgs;
	    }

	    if (baseObjPtr != NULL) {
		if (Tcl_GetLongFromObj(interp, baseObjPtr,
			(long*) &baseClock) != TCL_OK) {
		    return TCL_ERROR;
		}
	    } else {
		baseClock = TclpGetSeconds();
	    }

	    if (useGMT) {
		zone = -50000; /* Force GMT */
	    } else {
		zone = TclpGetTimeZone((unsigned long) baseClock);
	    }

	    scanStr = Tcl_GetStringFromObj(objv[2], &dummy);
	    Tcl_MutexLock(&clockMutex);
	    if (TclGetDate(scanStr, (unsigned long) baseClock, zone,
		    (unsigned long *) &clockVal) < 0) {
		Tcl_MutexUnlock(&clockMutex);
		Tcl_AppendStringsToObj(resultPtr,
			"unable to convert date-time string \"",
			scanStr, "\"", (char *) NULL);
		return TCL_ERROR;
	    }
	    Tcl_MutexUnlock(&clockMutex);

	    Tcl_SetLongObj(resultPtr, (long) clockVal);
	    return TCL_OK;

	case COMMAND_SECONDS:			/* seconds */
	    if (objc != 2) {
		Tcl_WrongNumArgs(interp, 2, objv, NULL);
		return TCL_ERROR;
	    }
	    Tcl_SetLongObj(resultPtr, (long) TclpGetSeconds());
	    return TCL_OK;
	default:
	    return TCL_ERROR;	/* Should never be reached. */
    }
}

/*
 *-----------------------------------------------------------------------------
 *
 * FormatClock --
 *
 *      Formats a time value based on seconds into a human readable
 *	string.
 *
 * Results:
 *      Standard Tcl result.
 *
 * Side effects:
 *      None.
 *
 *-----------------------------------------------------------------------------
 */

static int
FormatClock(interp, clockVal, useGMT, format)
    Tcl_Interp *interp;			/* Current interpreter. */
    unsigned long clockVal;	       	/* Time in seconds. */
    int useGMT;				/* Boolean */
    char *format;			/* Format string */
{
    struct tm *timeDataPtr;
    Tcl_DString buffer;
    int bufSize;
    char *p;
    int result;
    time_t tclockVal;
#if !defined(HAVE_TM_ZONE) && !defined(WIN32)
    int savedTimeZone = 0;	/* lint. */
    char *savedTZEnv = NULL;	/* lint. */
#endif

#ifdef HAVE_TZSET
    /*
     * Some systems forgot to call tzset in localtime, make sure its done.
     */
    static int  calledTzset = 0;

    Tcl_MutexLock(&clockMutex);
    if (!calledTzset) {
        tzset();
        calledTzset = 1;
    }
    Tcl_MutexUnlock(&clockMutex);
#endif

    /*
     * If the user gave us -format "", just return now
     */
    if (*format == '\0') {
	return TCL_OK;
    }

#if !defined(HAVE_TM_ZONE) && !defined(WIN32)
    /*
     * This is a kludge for systems not having the timezone string in
     * struct tm.  No matter what was specified, they use the local
     * timezone string.  Since this kludge requires fiddling with the
     * TZ environment variable, it will mess up if done on multiple
     * threads at once.  Protect it with a the clock mutex.
     */

    Tcl_MutexLock(&clockMutex);
    if (useGMT) {
        CONST char *varValue;

        varValue = Tcl_GetVar2(interp, "env", "TZ", TCL_GLOBAL_ONLY);
        if (varValue != NULL) {
	    savedTZEnv = strcpy(ckalloc(strlen(varValue) + 1), varValue);
        } else {
            savedTZEnv = NULL;
	}
        Tcl_SetVar2(interp, "env", "TZ", "GMT0", TCL_GLOBAL_ONLY);
        savedTimeZone = timezone;
        timezone = 0;
        tzset();
    }
#endif

    tclockVal = clockVal;
    timeDataPtr = TclpGetDate(&tclockVal, useGMT);

    /*
     * Make a guess at the upper limit on the substituted string size
     * based on the number of percents in the string.
     */

    for (bufSize = 1, p = format; *p != '\0'; p++) {
	if (*p == '%') {
	    bufSize += 40;
	} else {
	    bufSize++;
	}
    }

    Tcl_DStringInit(&buffer);
    Tcl_DStringSetLength(&buffer, bufSize);

    /* If we haven't locked the clock mutex up above, lock it now. */

#if defined(HAVE_TM_ZONE) || defined(WIN32)
    Tcl_MutexLock(&clockMutex);
#endif
    result = TclpStrftime(buffer.string, (unsigned int) bufSize, format,
	    timeDataPtr, useGMT);
#if defined(HAVE_TM_ZONE) || defined(WIN32)
    Tcl_MutexUnlock(&clockMutex);
#endif

#if !defined(HAVE_TM_ZONE) && !defined(WIN32)
    if (useGMT) {
        if (savedTZEnv != NULL) {
            Tcl_SetVar2(interp, "env", "TZ", savedTZEnv, TCL_GLOBAL_ONLY);
            ckfree(savedTZEnv);
        } else {
            Tcl_UnsetVar2(interp, "env", "TZ", TCL_GLOBAL_ONLY);
        }
        timezone = savedTimeZone;
        tzset();
    }
    Tcl_MutexUnlock(&clockMutex);
#endif

    if (result == 0) {
	/*
	 * A zero return is the error case (can also mean the strftime
	 * didn't get enough space to write into).  We know it doesn't
	 * mean that we wrote zero chars because the check for an empty
	 * format string is above.
	 */
	Tcl_AppendStringsToObj(Tcl_GetObjResult(interp),
		"bad format string \"", format, "\"", (char *) NULL);
	return TCL_ERROR;
    }

    Tcl_SetStringObj(Tcl_GetObjResult(interp), buffer.string, -1);

    Tcl_DStringFree(&buffer);
    return TCL_OK;
}