Results from running memtest86 on three computers:
Memtest86 Summary
AMD Athlon XP 2100
Intel Pentium M 1.3GHz
AMD Athlon 64 3400
(socket754)
Operating
Frequency
1.7GHz
1.3GHz
2.4GHz
L1 Size(KiB)
128
64
128
L2 Size(KiB)
256
1024
512
RAM Size(MiB)
768
496
1024
L1 Rate(MiB/s)
10602
16034
19664
L2 Rate(MiB/s)
3375
7967
4886
RAM Rate(MiB/s)
505
939
1441
Chipset
VIA KT400(A)/600
Intel i855GM/GME
FSB:99MHz
SIS 760/M760
Settings
DDR266
RAM: 132MHz (DDR264)
CAS: 2-3-2-6
DDR400
Notes:
memtest misidentified the operating frequency of the Athlon64. It should be 2.2GHz.
The Athlon64, being on socket 754, only have a single channel to the RAM.
Athlon64 on socket 939 (I don't have one) would have two channels to the RAM. Of course, to take advantage of the second channel, you'd have to have a second DIMM module.
I have no idea how to test the decrease in latency of RAM access AMD touts in Athlon64 with its built-in memory controller.
(originally from http://microjet.ath.cx/WebWiki/Amd64HyperTransportSingleChannel.html)
A common problem with webapps is providing an interface to page through result set like what various search engines do.
I have yet to find a website that discuss this in depth. So, here is a summary of solutions I came up with by looking at various pieces in the Web for doing result pagination with Postgresql. I hope this would give the sorely needed encouragement for people to start sharing their findings.
The problem actually has three components:
displaying result for a certain page,
not causing undue latency in page display, and
counting number of results.
Counting the accurate number of results would almost always require the full result set to be counted. Applications would usually cache this number. How exactly is the counting done depends on the approach taken.
And there are two basic approaches:
operating on piecemeal result set
operating on full result set
Please realise that there is no 'best' approach. Each comes with its own pros and cons.
To illustrate the pros and cons, I am employing two kind of queries: cheap and expensive. I categorise queries according to the effort exacted from pgsql: cheap and expensive. This categorisation is only for simplicity purposes as there are, of course, grey areas, queries that are neither cheap nor expensive; not to mention that cheap and expensive are subjective terms anyway.
dms3_test=> create view cheap as select id from document;
CREATE VIEW
Time: 150.078 ms
dms3_test=> create view expensive as SELECT doc.id FROM
attribute as attr0, attribute_name as an0, document as doc, state,
document_attribute as da0 WHERE state.id = doc.state_id AND state.name
= 'new' AND da0.doc_id = doc.id AND attr0.id = da0.attribute_id AND
an0.id = attr0.name_id AND an0.name = 'ssis_client' AND attr0.value
ILIKE 'client1';
CREATE VIEW
Time: 120.979 ms
Since what is pro and what is con depend very much against the context, I simply list the characteristics of each approach without further labelling.
Operating on Piecemeal Result Set (New Query for Each Page)
In this approach, a new query is run for each page. Each query differs only in OFFSET and LIMIT clauses.
For example, for the first page, the query would be executed with OFFSET 0 and LIMIT 10. The second page would be OFFSET 11 LIMIT 10.
This approach is popular and is found in various web applications. It is simple to implement and has an acceptable performance on cheap queries.
Number of matching results could be counted with a SELECT COUNT(*) in the beginning. This number could be cached as well so as to reduce the load on the server.
The latency in page display is low in the beginning and degrades linearly as user moves deeper.
The problem with this approach is it does not reuse previous effort. This is especially problematic if the query is expensive.
Another problem is each query would potentially see different snapshot of the data. If user is browsing page n and the underlying data changes, refreshing or revisiting page n would show a different data. cheap query
dms3_test=> abort;
begin;
select count(*) from cheap;
select * from cheap order by id offset 0 limit 10;
select * from cheap order by id offset 50000 limit 10;
ROLLBACK
Time: 1.640 ms
BEGIN
Time: 6.187 ms
count
---------
1010431
(1 row)
Time: 1094.187 ms
id
----
1
2
3
4
5
6
9
10
11
12
(10 rows)
Time: 57.371 ms
id
-------
50003
50004
50005
50006
50007
50008
50009
50010
50011
50012
(10 rows)
Time: 134.610 ms
expensive query
dms3_test=> abort;
begin;
select count(*) from expensive;
select * from expensive order by id offset 0 limit 10;
select * from expensive order by id offset 50000 limit 10;
ROLLBACK
Time: 2.698 ms
BEGIN
Time: 4.584 ms
count
-------
68276
(1 row)
Time: 18034.510 ms
id
-----
6
50
55
65
89
109
110
133
144
155
(10 rows)
Time: 76.929 ms
id
--------
749659
749661
749667
749685
749692
749720
749732
749740
749741
749778
(10 rows)
Time: 14424.053 ms
Characteristics:
simple implementation
no setup cost
suitable for cheap queries
suitable if user is expected to browse through only few pages
not isolated from underlying changes
latency degrades linearly
Operating on Full Result Set
This approach takes off from the previous one by reusing previous effort. The database takes a hit only on new query criteria, instead of every time the user changes pages.
This approach could be implemented by using either a temporary table or a without hold cursor. Both implementations require the webapp to maintain and reuse the transaction in which the table or cursor is defined.
A common strategy is to maintain a fixed number of connections to the database and assign one connection to the processing of a query in a round-robin way, i.e. map a specific query criteria to a specific connection.
In each connection, a transaction is held open throughout the duration of the webapp. This transaction would hold various temporary tables or cursors. You would want to keep the transaction open as long as possible. Warning: keeping a transaction open for a long time would have the
following negative side-effects:
prevents vacuum from removing all dead tuples.
blocks changes to schema
may block other transactions if data is modified within the transaction
Moreover, transactions may become invalid at any time due to some unforeseen event like Bob spilling his soda over the ethernet switch.
Therefore, your webapp should be able to re-connect and re-setup the temporary tables or cursors setup if the existing connection or transaction is no longer valid.
Being able to re-setup would also allow the DBA to vacuum thoroughly and/or make schema updates by simply killing and temporarily blocking connections from your webapp during low-traffic hours without having to restart your webapp. This is a big deal if the DBA person is not the sysadmin or have permission to restart your webapp.
Before processing each query, it is recommended to generate a SAVEPOINT so that any error in processing a query would not destroy the transaction.
Using Temporary Tables
The result set could be piped into a temporary table via the CREATE TEMPORARY TABLE foo AS command. It is important to remember to use a temporary table since it is not journalled into the WAL (write-ahead logging) which would have negative impact on performance.
The implementation gives you a free count of matching result when you do the CREATE TEMPORARY TABLE AS. I am not sure why psql does not show the count, but it is accessible from within a stored procedure or your DB driver. cheap query
dms3_test=> abort;
begin;
create temporary table foo as select * from cheap order by id;
select * from foo order by id offset 0 limit 10;
select * from foo order by id offset 50000 limit 10;
ROLLBACK
Time: 60.744 ms
BEGIN
Time: 0.686 ms
SELECT
Time: 15125.956 ms
id
----
1
2
3
4
5
6
9
10
11
12
(10 rows)
Time: 4397.762 ms
id
-------
50003
50004
50005
50006
50007
50008
50009
50010
50011
50012
(10 rows)
Time: 4413.789 ms
expensive query
dms3_test=> abort;
begin;
create temporary table foo as select * from expensive order by id;
select * from foo order by id offset 0 limit 10;
select * from foo order by id offset 50000 limit 10;
ROLLBACK
Time: 52.777 ms
BEGIN
Time: 3.683 ms
SELECT
Time: 18666.615 ms
id
-----
6
50
55
65
89
109
110
133
144
155
(10 rows)
Time: 314.754 ms
id
--------
749659
749661
749667
749685
749692
749720
749732
749740
749741
749778
(10 rows)
Time: 342.207 ms
Characteristics:
complex implementation
high setup cost
free result count as a side-effect
suitable for expensive queries
suitable if user is expected to comprehensively browse the result set
isolated from underlying changes
latency still degrades linearly but more gently
Using Without Hold Cursors
Without hold cursors are destroyed at the end of transaction, similar to temporary tables. On the other hand, with hold cursors outlive the creating transaction, although they are still bounded within a session. I recommend using without hold cursors to simplify garbage management. cheap query
dms3_test=> abort;
begin;
declare cheap_cursor scroll cursor for select * from cheap order by id;
move all from cheap_cursor;
move first from cheap_cursor;
fetch 10 from cheap_cursor;
move absolute 50000 from cheap_cursor;
fetch 10 from cheap_cursor;
ROLLBACK
Time: 4.054 ms
BEGIN
Time: 0.970 ms
DECLARE CURSOR
Time: 1.022 ms
MOVE 1010431
Time: 12434.136 ms
MOVE 1
Time: 4.409 ms
id
----
2
3
4
5
6
9
10
11
12
13
(10 rows)
Time: 4.418 ms
MOVE 1
Time: 30.055 ms
id
-------
50003
50004
50005
50006
50007
50008
50009
50010
50011
50012
(10 rows)
Time: 3.875 ms
expensive query
dms3_test=> abort;
begin;
declare expensive_cursor scroll cursor for select * from expensive order by id;
move all from expensive_cursor;
move first from expensive_cursor;
fetch 10 from expensive_cursor;
move absolute 50000 from expensive_cursor;
fetch 10 from expensive_cursor;
ROLLBACK
Time: 2.044 ms
BEGIN
Time: 0.739 ms
DECLARE CURSOR
Time: 51.912 ms
MOVE 68276
Time: 19036.148 ms
MOVE 1
Time: 1.055 ms
id
-----
50
55
65
89
109
110
133
144
155
186
(10 rows)
Time: 0.911 ms
MOVE 1
Time: 30.226 ms
id
--------
749659
749661
749667
749685
749692
749720
749732
749740
749741
749778
(10 rows)
Time: 1.736 ms
Characteristics:
complex implementation
high setup cost
suitable for expensive queries
suitable if user is expected to comprehensively browse the result set
isolated from underlying changes
barely noticeable latency
Hybrid Approach
One could do a hybrid approach. The implementation would be even more complex, but in some cases, it could combine the no setup cost benefit of the piecemeal approach and the low latency of the full result approach.
The hybrid approach would operate on piecemeal result set until a certain threshold is reached, e.g.: paging past page 7. When that happens, one of the full result set approach is executed, preferably in the background. The webapp could transition to using the full result set when it is ready.
Summary
Summary of Implementations
Query Type
Implementation
Setup(ms)
Counting(ms)
First Page(ms)
5000th Page(ms)
Cheap
New query per page
N/A
1094.187
57.371
134.610
Temporary Table
15125.956
N/A
4397.762
4413.789
Cursor
1.022
12434.136
8.827
33.930
Expensive
New query per page
N/A
18034.510
76.929
14424.053
Temporary Table
18666.615
N/A
314.754
342.207
Cursor
51.921
19036.148
1.966
31.962
(originally from http://microjet.ath.cx/WebWiki/ResultPaginationWithPostgresql.html)
I am using LVM2 with linux 2.6.11. One of the drive in the volume is going bad. I could hear screeching noise while the platters were spinning.
That is bad. Not a problem, though, as I can just vacate the data from that drive using pvmove.
The bad drive is /dev/bad. /dev/avail is a volume with some free space.
VG Name vg0
PV Size 148.09 GB / not usable 0
Allocatable yes
PE Size (KByte) 4096
Total PE 37911
Free PE 25111
Allocated PE 12800
PV UUID e5EvaO-0oo5-Zenl-KzkY-wFf7-EkQX-di1mXt
--- Physical volume ---
PV Name /dev/avail
VG Name vg0
PV Size 186.30 GB / not usable 0
Allocatable yes
PE Size (KByte) 4096
Total PE 47694
Free PE 20658
Allocated PE 27036
PV UUID HLtCMK-751U-IFAW-5Aj3-FQ3w-5tqO-8svvwt
Let's move it to the only drive with available space, /dev/avail. /dev/bad has 12800 allocated PE while /dev/avail has 20658 free PE. I was not expecting any problem fitting the data in /dev/bad into /dev/avail.
# pvmove -i 5 -v /dev/bad
Finding volume group "vg0"
Archiving volume group "vg0" metadata.
Creating logical volume pvmove0
Moving 0 extents of logical volume vg0/lv0
Insufficient contiguous allocatable extents (1777) for logical
volume pvmove0: 12800 required
Unable to allocate temporary LV for pvmove.
Urk? It needs to be contiguous? What to do now?
Searching through the lvm mailing list shows that pvmove is dumb. It only sees the first free PE (physical extent). OK, let's work around this.
pv0 is the physical volume corresponding to /dev/bad. From this, we see that there are three segments residing in pv0. The first starts at PE 30720.
Let's try to fill that 1777 free PE on the dest drive, /dev/avail. That means, we'll be moving PE 30720 to (30720+1777-1=32496) from pv0.
# pvmove -i 5 -v /dev/bad:30720-32496
Finding volume group "vg0"
Archiving volume group "vg0" metadata.
Creating logical volume pvmove0
Moving 0 extents of logical volume vg0/lv0
Moving 1777 extents of logical volume vg0/lv1
Moving 0 extents of logical volume vg0/lv2
Moving 0 extents of logical volume vg0/lv3
Found volume group "vg0"
Updating volume group metadata
Creating volume group backup "/etc/lvm/backup/vg0"
Found volume group "vg0"
Found volume group "vg0"
Loading vg0-pvmove0
Found volume group "vg0"
Loading vg0-lv1
Checking progress every 5 seconds
/dev/hdf2: Moved: 7.7%
/dev/hdf2: Moved: 14.4%
/dev/hdf2: Moved: 22.6%
/dev/hdf2: Moved: 29.7%
/dev/hdf2: Moved: 37.4%
/dev/hdf2: Moved: 44.6%
/dev/hdf2: Moved: 52.3%
/dev/hdf2: Moved: 60.0%
/dev/hdf2: Moved: 67.7%
/dev/hdf2: Moved: 75.4%
/dev/hdf2: Moved: 83.1%
/dev/hdf2: Moved: 90.3%
/dev/hdf2: Moved: 97.4%
/dev/hdf2: Moved: 100.0%
Found volume group "vg0"
Found volume group "vg0"
Found volume group "vg0"
Loading vg0-pvmove0
Found volume group "vg0"
Loading vg0-lv1
Found volume group "vg0"
Found volume group "vg0"
Removing temporary pvmove LV
Writing out final volume group after pvmove
Creating volume group backup "/etc/lvm/backup/vg0"
Finally, after repeating the above process for the remaining segments, /dev/bad, aka pv0, is free of data and is safe to take down.
# vgreduce vg0 /dev/bad
Toss it in the garbage bin.
(originally from http://microjet.ath.cx/WebWiki/pvmove%20problem.html)
When you do M-x foo in emacs, you are asking emacs to execute an interactive function named "foo". The function must be declared in a special way as follow:
The form on the second line, "(interactive)", tells emacs that this function is an interactive function.
But how does the emacs knows you put that form in that function? It could try to execute the function, but it may produce an unwanted side-effect (like executing "rm -rf /" if the condition is met).
Could you do a partial execution, e.g.: when a function is defined, execute only the first form and don't execute the rest? But that does not explain the following capability:
When you do M-x foo/2, emacs would first run the read-string function, prompt the user, get the answer from user, and then run the function foo/2. But if you call foo/2 non-interactively by calling it from another function like in call-foo/2, emacs will not prompt the user at all.
Well, that blows the hypotheses that when a function is defined, emacs executes only the first form. There could be no execution whatsoever since doing so may cause severe side-effect. What if the execute form in call-foo/3 is executed? Bad, bad, bad.
So, where is the magic?
The magic lies in macro
defun is actually a special form call. A special form is a form that may be implemented and/or executed differently. If a form is implemented in non-elisp language, it is a special form. If a form is not a function call form, it is a special form (ordinarily, '(something)' in elisp would execute the function 'something').
A special form basically can do anything to its body (macro), including scanning for the interactive form when it is called, just like what the special form defun does.
(originally from http://microjet.ath.cx/WebWiki/HowDoesEmacsKnowWhichFunctionsAreInteractive.html)