View Issue Details

IDProjectCategoryView StatusLast Update
0007516mantisbtdb mssqlpublic2014-05-16 15:00
Reporterrzg Assigned Togiallu  
PrioritynormalSeverityminorReproducibilityalways
Status closedResolutionsuspended 
Product Version1.1.0a1 
Summary0007516: My mssql performance tweaks
Description

As I mentioned in another bug, I had to make changes to filter_api to get SQL Server 2000 to perform acceptably. I'm posting those changes here so they don't get forgotten- I'm not sure if they're suitable for integration into the mainstream.

Prior to these changes, pulling up the "My View" for all projects (10000+ issues) took over 90 seconds. After these changes, the page can be pulled up in 1.5 seconds.

Steps To Reproduce

C:\src\mantis>cvs diff -ub -r 1.141 core\filter_api.php
Empty password used - try 'cvs login' with a real password
Index: core/filter_api.php

RCS file: /cvsroot/mantisbt/mantisbt/core/filter_api.php,v
retrieving revision 1.141
diff -u -b -r1.141 filter_api.php
--- core/filter_api.php 6 May 2006 02:58:43 -0000 1.141
+++ core/filter_api.php 12 Oct 2006 17:35:52 -0000
@@ -778,7 +778,9 @@
$rows = array();

            if ( $bug_count > 0 ) {
  • $t_where = "WHERE $t_bug_table.id in (" . implode( ", ",
    $t_id_array ) . ")";
    +#We've already built this query, so use it as a subquery

  • #$t_where = "WHERE $t_bug_table.id in (" . implode( ", "
    , $t_id_array ) . ")";
    +$t_where = "WHERE $t_bug_table.id in (" .$query . ")";
    } else {
    return $rows;
    }
    @@ -875,7 +877,7 @@
    $t_order = " ORDER BY " . implode( ', ', $t_order_array );
    $t_select = implode( ', ', array_unique( $t_select_clauses
    ) );

  • $query2 = "SELECT DISTINCT $t_select

  • $query2 = "SELECT $t_select
    $t_from
    $t_join
    $t_where
    @@ -913,7 +915,9 @@
    // paulr: it should be impossible for t_id_array_lastmod to be a
    rray():
    // that would imply that $t_id_array is null which aborts this f
    unction early
    //if ( count( $t_id_array_lastmod ) > 0 ) {

  • $t_where = "WHERE $t_bugnote_table.bug_id in (" . implode( ", ",
    $t_id_array_lastmod ) . ")";

  • //$t_where = "WHERE $t_bugnote_table.bug_id in (" . implode( ",
    ", $t_id_array_lastmod ) . ")";

  • $t_where = "WHERE $t_bugnote_table.bug_id in (" . $query . ")";

  •         $query3 = "SELECT DISTINCT bug_id,MAX(last_modified) as last_mod

    ified, COUNT(last_modified) as count FROM $t_bugnote_table $t_where GROUP BY bug
    _id";

Tagspatch
Attached Files
optimizepatch.patch (3,059 bytes)   
Index: core/filter_api.php
===================================================================
RCS file: /cvsroot/mantisbt/mantisbt/core/filter_api.php,v
retrieving revision 1.163
diff -u -b -r1.163 filter_api.php
--- core/filter_api.php	15 Sep 2007 21:48:26 -0000	1.163
+++ core/filter_api.php	17 Sep 2007 13:59:49 -0000
@@ -1245,26 +1245,19 @@
 			$t_where	= '';
 		}
 
-		# Possibly do two passes. First time, grab the IDs of issues that match the filters. Second time, grab the IDs of issues that
-		# have bugnotes that match the text search if necessary.
+
 		$t_id_array = array();
-		for ( $i = 0; $i < 2; $i++ ) {
 			$t_id_where = $t_where;
 			$t_id_join = $t_join;
-			if ( $i == 0 ) {
-				if ( !is_blank( $t_id_where ) && !is_blank( $t_textsearch_where_clause ) ) {
-					$t_id_where = $t_id_where . ' AND ' . $t_textsearch_where_clause;
-				}
-			} else if ( !is_blank( $t_textsearch_wherejoin_clause ) ) {
+            if ( !is_blank( $t_textsearch_wherejoin_clause ) ) {
 				$t_id_where = $t_id_where . ' AND ' . $t_textsearch_wherejoin_clause;
-				$t_id_join = $t_id_join . " INNER JOIN $t_bugnote_table ON $t_bugnote_table.bug_id = $t_bug_table.id";
-				$t_id_join = $t_id_join . " INNER JOIN $t_bugnote_text_table ON $t_bugnote_text_table.id = $t_bugnote_table.bugnote_text_id";
+				$t_id_join = $t_id_join . " LEFT JOIN $t_bugnote_table ON $t_bugnote_table.bug_id = $t_bug_table.id";
+				$t_id_join = $t_id_join . " LEFT JOIN $t_bugnote_text_table ON $t_bugnote_text_table.id = $t_bugnote_table.bugnote_text_id";
 			}
 			$query  = "SELECT DISTINCT $t_bug_table.id AS id
 						$t_from
 						$t_id_join
 						$t_id_where";
-			if ( ( $i == 0 ) || ( !is_blank( $t_textsearch_wherejoin_clause ) ) ) {
 				$result = db_query( $query );
 				$row_count = db_num_rows( $result );
 
@@ -1272,10 +1265,7 @@
 					$row = db_fetch_array( $result );
 					$t_id_array[] = db_prepare_int ( $row['id'] );
 				}
-			}
-		}
 
-		$t_id_array = array_unique( $t_id_array );
 
 		# Get the total number of bugs that meet the criteria.
 		$bug_count = count( $t_id_array );
@@ -1283,7 +1273,7 @@
 		$rows = array();
 
 		if ( $bug_count > 0 ) {
-			$t_where = "WHERE $t_bug_table.id in (" . implode( ", ", $t_id_array ) . ")";
+        	$t_where = "WHERE $t_bug_table.id in (" .$query . ")";
 		} else {
 			return $rows;
 		}
@@ -1420,7 +1410,9 @@
 		// paulr: it should be impossible for t_id_array_lastmod to be array():
 		// that would imply that $t_id_array is null which aborts this function early
 		//if ( count( $t_id_array_lastmod ) > 0 ) {
-		$t_where = "WHERE $t_bugnote_table.bug_id in (" . implode( ", ", $t_id_array_lastmod ) . ")";
+		//$t_where = "WHERE $t_bugnote_table.bug_id in (" . implode( ", ", $t_id_array_lastmod ) . ")";
+		$t_where = "WHERE $t_bugnote_table.bug_id in (" . $query . ")";
+
 		
 		$query3 = "SELECT DISTINCT bug_id,MAX(last_modified) as last_modified, COUNT(last_modified) as count FROM $t_bugnote_table $t_where GROUP BY bug_id";
 
optimizepatch.patch (3,059 bytes)   
bug7516.patch (4,573 bytes)   
Index: core/filter_api.php
===================================================================
RCS file: /cvsroot/mantisbt/mantisbt/core/filter_api.php,v
retrieving revision 1.163
diff -u -r1.163 filter_api.php
--- core/filter_api.php	15 Sep 2007 21:48:26 -0000	1.163
+++ core/filter_api.php	28 Sep 2007 11:46:27 -0000
@@ -1245,47 +1245,26 @@
 			$t_where	= '';
 		}
 
-		# Possibly do two passes. First time, grab the IDs of issues that match the filters. Second time, grab the IDs of issues that
-		# have bugnotes that match the text search if necessary.
-		$t_id_array = array();
-		for ( $i = 0; $i < 2; $i++ ) {
-			$t_id_where = $t_where;
-			$t_id_join = $t_join;
-			if ( $i == 0 ) {
-				if ( !is_blank( $t_id_where ) && !is_blank( $t_textsearch_where_clause ) ) {
-					$t_id_where = $t_id_where . ' AND ' . $t_textsearch_where_clause;
-				}
-			} else if ( !is_blank( $t_textsearch_wherejoin_clause ) ) {
-				$t_id_where = $t_id_where . ' AND ' . $t_textsearch_wherejoin_clause;
-				$t_id_join = $t_id_join . " INNER JOIN $t_bugnote_table ON $t_bugnote_table.bug_id = $t_bug_table.id";
-				$t_id_join = $t_id_join . " INNER JOIN $t_bugnote_text_table ON $t_bugnote_text_table.id = $t_bugnote_table.bugnote_text_id";
-			}
-			$query  = "SELECT DISTINCT $t_bug_table.id AS id
-						$t_from
-						$t_id_join
-						$t_id_where";
-			if ( ( $i == 0 ) || ( !is_blank( $t_textsearch_wherejoin_clause ) ) ) {
-				$result = db_query( $query );
-				$row_count = db_num_rows( $result );
-
-				for ( $j=0; $j < $row_count; $j++ ) {
-					$row = db_fetch_array( $result );
-					$t_id_array[] = db_prepare_int ( $row['id'] );
-				}
-			}
-		}
 
-		$t_id_array = array_unique( $t_id_array );
+		$t_id_where = $t_where;
+		$t_id_join = $t_join;
+		if ( !is_blank( $t_textsearch_wherejoin_clause ) ) {
+			$t_id_where = $t_id_where . ' AND ' . $t_textsearch_wherejoin_clause;
+			$t_id_join .= " LEFT JOIN $t_bugnote_table ON $t_bugnote_table.bug_id = $t_bug_table.id";
+			$t_id_join .= " LEFT JOIN $t_bugnote_text_table ON $t_bugnote_text_table.id = $t_bugnote_table.bugnote_text_id";
+		}
+		$query  = "SELECT DISTINCT $t_bug_table.id AS id
+			$t_from
+			$t_id_join
+			$t_id_where";
+		$result = db_query( $query );
 
 		# Get the total number of bugs that meet the criteria.
-		$bug_count = count( $t_id_array );
-
-		$rows = array();
-
+		$bug_count = db_num_rows( $result );
 		if ( $bug_count > 0 ) {
-			$t_where = "WHERE $t_bug_table.id in (" . implode( ", ", $t_id_array ) . ")";
+			$t_where = "WHERE $t_bug_table.id in (" .$query . ")";
 		} else {
-			return $rows;
+			return array();
 		}
 
 		$t_from = 'FROM ' . $t_bug_table;
@@ -1402,40 +1381,34 @@
 		$result2 = db_query( $query2, $c_per_page, $t_offset );
 
 		$row_count = db_num_rows( $result2 );
-
-		$t_id_array_lastmod = array();
 		
+		$t_filtered_rows = array();
 		for ( $i=0 ; $i < $row_count ; $i++ ) {
 			$row = db_fetch_array( $result2 );
-			$t_id_array_lastmod[] = db_prepare_int ( $row['id'] );
 			
 			$row['date_submitted'] = db_unixtimestamp ( $row['date_submitted'] );
 			$row['last_updated'] = db_unixtimestamp ( $row['last_updated'] );
 					
-			array_push( $rows, $row );
+			array_push( $t_filtered_rows, $row );
 		}
 
-		$t_id_array_lastmod = array_unique( $t_id_array_lastmod );
-		
-		// paulr: it should be impossible for t_id_array_lastmod to be array():
-		// that would imply that $t_id_array is null which aborts this function early
-		//if ( count( $t_id_array_lastmod ) > 0 ) {
-		$t_where = "WHERE $t_bugnote_table.bug_id in (" . implode( ", ", $t_id_array_lastmod ) . ")";
-		
-		$query3 = "SELECT DISTINCT bug_id,MAX(last_modified) as last_modified, COUNT(last_modified) as count FROM $t_bugnote_table $t_where GROUP BY bug_id";
+
+		#####
+		# Third query
+		# Collect the number of notes associated to each bug
+		$t_where = "WHERE $t_bugnote_table.bug_id IN (" . $query . ")";
+		$query3 = "SELECT DISTINCT bug_id, MAX(last_modified) AS last_modified, COUNT(last_modified) AS count FROM $t_bugnote_table $t_where GROUP BY bug_id";
 
 		# perform query
 		$result3 = db_query( $query3 );
-
 		$row_count = db_num_rows( $result3 );
 
 		for ( $i=0 ; $i < $row_count ; $i++ ) {
 			$row = db_fetch_array( $result3 );
-			
 			$t_stats[ $row['bug_id'] ] = $row;
 		}
 
-		foreach($rows as $row) {
+		foreach($t_filtered_rows as $row) {
 			if( !isset( $t_stats[ $row['id'] ] ) ) {
 				bug_cache_database_result( $row, false );
 			} else {
@@ -1443,7 +1416,7 @@
 			}
 		}
 
-		return $rows;
+		return $t_filtered_rows;
 	}
 
 	# --------------------
bug7516.patch (4,573 bytes)   

Relationships

related to 0007716 closedgrangeway Can not handel Querry with over 3000 bugs 
related to 0007933 closedgrangeway Filter does not work as expected 

Activities

vboctor

vboctor

2006-10-13 12:26

manager   ~0013614

Maybe we need to have a flag that indicates if the used DBMS supports nested queries or not, if so, then we use sub-queries to improve performance. I wonder if such flag is already supported in ADODB library. If so, then we should definately use it.

I haven't review the above patch in details, but the main concept in as far as I quickly see is the use of nested queries.

exk72

exk72

2007-01-07 03:31

reporter   ~0013900

Some info for MySQL:

http://dev.mysql.com/doc/refman/5.0/en/subquery-restrictions.html
http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-subqueries.html

Newer versions of ADODB; ie 4.93, have introduced a _nestedSQL property. It does not exist in the MySQL driver (yet?).

edwardgao

edwardgao

2007-05-02 15:20

reporter   ~0014421

This fix causes http://bugs.mantisbt.org/view.php?id=7933.

Do you have any other fixes to fix both.

Thanks

rzg

rzg

2007-05-02 23:39

reporter   ~0014424

At first I couldn't reproduce the issue on our company Mantis, but now I see it.

It seems related to the handling of joins in the subquery- I'm not sure if it is a bug in SQL2000 or Mantis, I have to really look at the query more. I'll also try it on a SQL2005 machine when I have a chance. Mixing ANSI-style joins and non-ANSI style joins is bound to eventually cause issues - either with the DBMS or the sanity of the programmer. :-)

You can fix it by changing the lines above the problem change to "LEFT" instead of "INNER" joins.
$t_id_join = $t_id_join . " LEFT JOIN $t_bugnote_table ON $t_bugnote_table.bug_id = $t_bug_table.id";
$t_id_join = $t_id_join . " LEFT JOIN $t_bugnote_text_table ON $t_bugnote_text_table.id = $t_bugnote_table.bugnote_text_id";

rzg

rzg

2007-05-03 00:03

reporter   ~0014425

Ok, I see how the code works. We do 2 passes- one with the join to the bugnotes and one without, and then we eliminate the duplicates in PHP. My optimization only picks up the second of the 2 queries, and therefore misses any cases of text searches where there are no bugnotes.

Making it a left join and skipping the second query in the loop is probably the correct solution, since the whole "run it twice" thing seems hacky, especially because you're returning the same results from both queries most of the time.

rzg

rzg

2007-05-03 00:14

reporter   ~0014426

That whole block of code, IMHO, should look like this. Much more readable:

    # Possibly do two passes. First time, grab the IDs of issues that match the filters. Second time, grab the IDs of issues that
    # have bugnotes that match the text search if necessary.
    $t_id_array = array();
        $t_id_where = $t_where;
        $t_id_join = $t_join;
        if ( !is_blank( $t_textsearch_wherejoin_clause ) ) {
            $t_id_where = $t_id_where . ' AND ' . $t_textsearch_wherejoin_clause;
            $t_id_join = $t_id_join . " LEFT JOIN $t_bugnote_table ON $t_bugnote_table.bug_id = $t_bug_table.id";
            $t_id_join = $t_id_join . " LEFT JOIN $t_bugnote_text_table ON $t_bugnote_text_table.id = $t_bugnote_table.bugnote_text_id";
        }
        $query  = "SELECT DISTINCT $t_bug_table.id AS id
                    $t_from
                    $t_id_join
                    $t_id_where";
            $result = db_query( $query );
            $row_count = db_num_rows( $result );

            for ( $j=0; $j < $row_count; $j++ ) {
                $row = db_fetch_array( $result );
                $t_id_array[] = db_prepare_int ( $row['id'] );
            }

    # Get the total number of bugs that meet the criteria.
    $bug_count = count( $t_id_array );

    $rows = array();

    if ( $bug_count > 0 ) {

#We've already built this query, so use it as a subquery
$t_where = "WHERE $t_bug_table.id in (" .$query . ")";
} else {
return $rows;
}

giallu

giallu

2007-09-15 18:56

reporter   ~0015643

So, nested queries are now OK to use and I'd like to fix this for 1.1

Could you please attach a patch in a proper format for me to review? I see a lot of code but it is not clear what I should review/apply to CVS.

I'd really appreciate if you could follow the directions here:

http://www.mantisbt.org/wiki/doku.php/mantisbt:howto_submit_patches

and submit a proper patch

rzg

rzg

2007-09-17 12:43

reporter   ~0015652

I've attached this as a patch. I do warn that this needs to be tested, both for performance and functionality, with mysql. I expect that it will work fine. It works fine for mssql, and I can pull up the "My View" screen for 0006844:0018000 issues in under 2 seconds, and a full text search (for all statuses) in about 13s.

giallu

giallu

2007-09-17 16:47

reporter   ~0015655

Great.
I am going to review the code and test it with MySQL, so we can fix this in time for 1.1

vboctor

vboctor

2007-09-26 02:11

manager   ~0015717

Some forum thread relating to performance issues:
http://www.mantisbt.org/forums/viewtopic.php?p=10270

vboctor

vboctor

2007-09-26 02:20

manager   ~0015718

Another related forum thread:
http://www.mantisbt.org/forums/viewtopic.php?p=10273

vboctor

vboctor

2007-09-26 02:22

manager   ~0015719

Yet another thread:
http://www.mantisbt.org/forums/viewtopic.php?p=10274

giallu

giallu

2007-09-28 07:55

reporter   ~0015754

Ok. In the revised patch I added some more changes because after applying the proposed patch some part of the code were just useless.

I also made few minor modifications related to code readability.

I tested the code but I will appreciate someone else willing to try the patch before I'll commit in CVS.

exk72

exk72

2007-09-28 08:34

reporter   ~0015756

Minor possible performance suggestion, use (!$result->EOF) instead of db_num_rows() to know if or how many times to loop.

vboctor

vboctor

2007-09-29 02:15

manager   ~0015763

I've tried the latest patch provided to me by giallu today (28th of September) and for some reason it seems to be slower than the old one, it also runs the same number of queries. I've noticed that we run a query to get the number of elements, then re-run it again as a nested query. I would have thought that we will run this query once as a nested query, hence, we should run 1 less query.

Following are three measurements based on the new and the old filter API.

NEW FILTER_API (37 queries)

1.1877 SQL Queries Total Time
2.4224 Page Request Total Time

1.5427 SQL Queries Total Time
3.3641 Page Request Total Time

1.4332 SQL Queries Total Time
2.567 Page Request Total Time

OLD FILTER_API (37 queries)

0.2486 SQL Queries Total Time
1.6529 Page Request Total Time

0.2755 SQL Queries Total Time
1.993 Page Request Total Time

0.478 SQL Queries Total Time
2.7048 Page Request Total Time

I haven't did some minor testing on the results of the queries (in terms of items returned), but no thorough.

giallu

giallu

2007-09-29 03:27

reporter   ~0015765

Well, I can't pretend I expected that but...

I noticed in the filter_get_bug_rows function 3 queries are run (this does not change after the patch); the difference is that after the patch the first one is used as subquery for the other two: now, since the first is the "heavy" one, I was wondering why it could be faster than before, since the heavy query is now run 3 times, instead of only once.

Even if we manage to not run the first one (but the num of rows from it is used to calculate the pagination parameters, so we should re-arrange the code a little bit), it will still run 2 heavy queries instead of one.

I should try to further analyze the actual queries, but you know, we are talking about the single 1100 lines filters function... I doubt it will make the cut for rc1

vboctor

vboctor

2007-09-29 03:32

manager   ~0015767

Lets see how we go, if we come up with something we are comfortable with, then fine, otherwise it won't make it in 1.1.0rc1. Understanding filter_api better would be great, since it will help us later to refactor this over complicated part of the code.

rzg

rzg

2007-09-30 17:49

reporter   ~0015777

The worse performance on MySQL is somewhat expected because of limitations in their query optimizer when dealing with subqueries. From http://dev.mysql.com/doc/refman/5.0/en/subquery-restrictions.html:

"Subquery optimization for IN is not as effective as for the = operator or for IN(value_list) constructs.
[...]
Consider the following statement that uses an uncorrelated subquery:

SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);

The optimizer rewrites the statement to a correlated subquery:

SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);

If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M×N), rather than O(M+N) as it would be for an uncorrelated subquery.

An implication is that an IN subquery can be much slower than a query written using an IN(value_list) construct that lists the same values that the subquery would return."

The MSSQL query optimizer does not have the limitations that MySQL has, and subqueries like these perform just fine. MSSQL does have a limitation on the use of IN(value_list)- specifically if value_list contains more than a couple dozen values, the performance becomes very poor. So a fix for one DBMS will likely impact the other, and vice versa.

grangeway

grangeway

2007-11-11 08:26

reporter   ~0016154

hello,

RZG, could you see if the SVN HEAD code improves this issue for you ?

Paul

grangeway

grangeway

2008-06-28 17:48

reporter   ~0018222

I believe the issues raised in this are resolved in later releases/SVN TRUNK.

I know i've done some amount of work profiling database queries and php code across the code base.

If issues still remain in SVN trunk, can you log the remaining issues as new bugs.

Thanks
Paul

rzg

rzg

2008-06-29 00:14

reporter   ~0018223

I can't say for trunk, but as for 1.1.2, my patch is still required to get the database queries to perform reasonably.

(My company mantis is at about 25000 issues spread across about 40 queues, with about 100 users.)

giallu

giallu

2008-07-31 18:10

reporter   ~0018977

So, this looks anything but fixed ( and thanks to rzg for the great explanation why this is not an easy fix).

Maybe we can consider composing the best query based on the DB backend?

grangeway

grangeway

2008-09-05 17:00

reporter   ~0019318

gaillu: as far as i'm concerned, i'm pretty sure it's fixed in 1.2.x

Paul

giallu

giallu

2008-09-06 07:11

reporter   ~0019321

Well, given that filter api is undergoing a fair amount of changes, I'd close this and move on.

Let's open a new ticket afterwards if the code we will ship in 1.2 still performs badly

grangeway

grangeway

2014-05-16 15:00

reporter   ~0040398

MantisBT currently supports Mysql and has support for other database engines.

The support for other databases is known to be problematic.

Having implemented the current database layer into Mantis 10 years ago, I'm currently working on replacing the current layer.

If you are interested in using Mantis with non-mysql databases - for example, Oracle, PGSQL or MSSQL, and would be willing to help out testing the new database layer, please drop me an email at paul@mantisforge.org

In the meantime, I'd advise running Mantis with Mysql Only to avoid issues.

Thanks
Paul