1

Topic: with productivity STRING_SPLIT

Thought built in STRING_SPLIT it will be faster explicit my bicycle (I do not remember whence  and me ).

create FUNCTION [dbo]. [fn_splitter] (@String NVARCHAR (4000), @ch char)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
WITH Split (stpos, endpos)
AS (
SELECT 0 AS stpos, CHARINDEX (@ch, @String) AS endpos
UNION ALL
SELECT endpos + 1 AS stpos, CHARINDEX (@ch, @String, endpos + 1) AS endpos
FROM Split
WHERE endpos> 0
)
SELECT ROW_NUMBER () OVER (ORDER BY (SELECT 1)) AS Id;
LTRIM (RTRIM (SUBSTRING (@String, stpos, COALESCE (NULLIF (endpos, 0), LEN (@String) + 1) - stpos))) AS Item
FROM Split
)
GO

In table 176  values with gaps, an insertion 410 values.

select l.lake_name, f. Item into testa0 from lake l
cross apply dbo.fn_splitter1 (l.lake_name, ' ') f
where CHARINDEX (' ', l.lake_name)> 0

With my function an insertion 1 , with   7 seconds

select l.lake_name, f.value into testb0 from lake l
cross apply STRING_SPLIT (l.lake_name, ' ') f
where CHARINDEX (' ', l.lake_name)> 0

Where an ambush?

2

Re: with productivity STRING_SPLIT

Lepsik;
CHARINDEX? STRING_SPLIT does not demand check.

3

Re: with productivity STRING_SPLIT

Lepsik;
STRING_SPLIT In current implementation that still a brake, but your "bicycle" still .
So you, probably, somewhere confused something.

4

Re: with productivity STRING_SPLIT

Kolosov wrote:

Lepsik;
CHARINDEX? STRING_SPLIT does not demand check.

How where something affects? Without it the same result. 1:7

5

Re: with productivity STRING_SPLIT

invm wrote:

Lepsik;
STRING_SPLIT In current implementation that still a brake, but your "bicycle" still .
So you, probably, somewhere confused something.

Is what in essence to tell?  I laid out.

6

Re: with productivity STRING_SPLIT

Lepsik wrote:

it is passed...
Is what in essence to tell?  I laid out.

I launched, on small table (1000) function more slowly in 5 times
On big (1000000)   in 10 times (6 and 60 seconds)
On the small table, but with the big lines (1000 on 8000 characters),  it is fulfilled 180 seconds, and function falls The maximum recursion 100 has been exhausted before statement completion.
Well and generally (if to correct maxrecursion) in it errors: Invalid length parameter passed to the LEFT or SUBSTRING function.

Lepsik wrote:

Where an ambush?

Hm, any ambush,  more slowly your function under certain conditions.
But it works without errors.

7

Re: with productivity STRING_SPLIT

Lepsik wrote:

Is what in essence to tell?  I laid out.

that "".
is that it is possible to take, launch and see (or not to see) a problem. So to yours  to  it is still very far.
At a forum there are subjects with arguing and examples of various variants of functions of a partition of a line. Find to itself among them approaching.

8

Re: with productivity STRING_SPLIT

STRING_SPLIT It is intended on the spirit for conversion of an input parameter to the tabular form. It "is possible" it "is not equally necessary". Storage in the table of not atomic data - heresy at its finest. wink

9

Re: with productivity STRING_SPLIT

Kolosov wrote:

STRING_SPLIT is intended on the spirit for conversion of an input parameter to the tabular form. It "is possible" it "is not equally necessary". Storage in the table of not atomic data - heresy at its finest. wink

Yes, the decision to make its slow therefore was accepted :-)
On mine, simply it was done  by programmers, moreover for certain by bypass ways (type,  through.NET)
Same it is necessary, to make its comparable or even slower, than  the recursive requests in the lines on T-SQL
Shockingly.

10

Re: with productivity STRING_SPLIT

If asked me I would answer that I use for the same purposes SplitString_Multi from Adam Machanic. And that really is not present trust to Hindus smile

11

Re: with productivity STRING_SPLIT

Vladislav Kolosov;
If asked me I would answer that if you permanently should use , at you  problem in architecture.

12

Re: with productivity STRING_SPLIT

invm wrote:

it is passed...
That "".
is that it is possible to take, launch and see (or not to see) a problem. So to yours  to  it is still very far.
At a forum there are subjects with arguing and examples of various variants of functions of a partition of a line. Find to itself among them approaching.

For example:
Function which divides a line into words

13

Re: with productivity STRING_SPLIT

TaPaK;
I about the same. It is necessary to store the atomic data in tables. It is required to break lists of listings for procedures  service.

14

Re: with productivity STRING_SPLIT

alexeyvg wrote:

On mine, simply it was done  by programmers

Yes. Only in other place smile
1. Function split_string really brake;
2. Nevertheless, split_string much faster functions ';
3. There is an aberrant behavior split_string in parallel plans.
[spoiler Repro]

use tempdb;
go
if object_id (' dbo.fn_splitter ', ' IF ') is not null
drop function dbo.fn_splitter;
go
create FUNCTION [dbo]. [fn_splitter] (@String NVARCHAR (4000), @ch char)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
WITH Split (stpos, endpos)
AS (
SELECT 0 AS stpos, CHARINDEX (@ch, @String) AS endpos
UNION ALL
SELECT endpos + 1 AS stpos, CHARINDEX (@ch, @String, endpos + 1) AS endpos
FROM Split
WHERE endpos> 0
)
SELECT ROW_NUMBER () OVER (ORDER BY (SELECT 1)) AS Id;
LTRIM (RTRIM (SUBSTRING (@String, stpos, COALESCE (NULLIF (endpos, 0), LEN (@String) + 1) - stpos))) AS Item
FROM Split
)
GO
/ ************************/
/* Settings of the Test */
/ ************************/
declare
@s nvarchar (max);
@pl int = 30;
@pc int = 5;
@rc int = 100000;
/ ************************/
select
@s = stuff ((
select top (@pc) ', ' + right (replicate (' 0 ', @pl) + cast (row_number () over (order by (select 1)) as nvarchar (30)), @pl) from (select a.number from master.dbo.spt_values a cross join master.dbo.spt_values b) t for xml path ("), type
).value ('. ', ' nvarchar (max) '), 1, 1, ");
if object_id (' tempdb..#t ', ' U ') is not null
drop table #t;
select top (@rc)
row_number () over (order by (select 1)) as n, @s as s
into
#t
from
master.dbo.spt_values a cross join
master.dbo.spt_values b;
go
if object_id (' dbo.t1 ', ' U ') is not null
drop table dbo.t1;
if object_id (' dbo.t2 ', ' U ') is not null
drop table dbo.t2;
if object_id (' dbo.t3 ', ' U ') is not null
drop table dbo.t3;
if object_id (' dbo.t4 ', ' U ') is not null
drop table dbo.t4;
go
set statistics xml, time on;
go
select t.n, a. Item into dbo.t1 from #t t cross apply dbo.fn_splitter (t.s, ', ') an option (querytraceon 8649);
select t.n, a. Item into dbo.t2 from #t t cross apply dbo.fn_splitter (t.s, ', ') an option (maxdop 1);
select t.n, a. Value into dbo.t3 from #t t cross apply string_split (t.s, ', ') an option (querytraceon 8649);
select t.n, a. Value into dbo.t4 from #t t cross apply string_split (t.s, ', ') an option (maxdop 1);
go
set statistics xml, time off;
go
drop function dbo.fn_splitter;
drop table dbo.t1, dbo.t2, dbo.t3, dbo.t4;
go

[/spoiler]

15

Re: with productivity STRING_SPLIT

invm;

wrote:

3. There is an aberrant behavior split_string in parallel plans.

On what to look?

16

Re: with productivity STRING_SPLIT

TaPaK wrote:

on what to look?

On CPU at serial and parallel plans.

17

Re: with productivity STRING_SPLIT

TaPaK wrote:

Vladislav Kolosov;
If asked me I would answer that if you permanently should use , at you  problem in architecture.

Present that your operation - to write  where as arguments can  lists.
Architecture bad?

18

Re: with productivity STRING_SPLIT

Lepsik wrote:

it is passed...
Present that your operation - to write  where as arguments can  lists.
Architecture bad?

You  understand, what the architecture means?

19

Re: with productivity STRING_SPLIT

invm wrote:

it is passed...
On CPU at serial and parallel plans.

Well as on me here not because of STRING_SPLIT, and from for insertions.

20

Re: with productivity STRING_SPLIT

TaPaK wrote:

Well as on me here not because of STRING_SPLIT, and from for insertions.

Explain, what exactly caused growth CPU time in the parallel plan with string_split, in comparison with serial and did not cause in similar with fn_splitter.

21

Re: with productivity STRING_SPLIT

invm wrote:

it is passed...
Explain, what exactly caused growth CPU time in the parallel plan with string_split, in comparison with serial and did not cause in similar with fn_splitter.

Well I looked only on string_split.
The operator in the parallel plan gives 4572/299 against 2145/2145 in the serial. It on MAXDOP 16. To speak about aberrant behavior at least it is strange.

22

Re: with productivity STRING_SPLIT

invm;
Well and

INSERT INTO dbo.t1 select t.n, a. Item from #t t cross apply dbo.fn_splitter (t.s, ', ') an option (querytraceon 8649, MAXDOP 16);
INSERT INTO dbo.t2 select t.n, a. Value from #t t cross apply string_split (t.s, ', ') an option (use hint (' ENABLE_PARALLEL_PLAN_PREFERENCE '), MAXDOP 16);

Gives absolutely reverse pattern,  a question to SELECT INTO which as on me  smile

23

Re: with productivity STRING_SPLIT

TaPaK;
At me for string_split parllelnyj / serial CPU time - 7689/3078
And it should be comparable. As it is visible for fn_splitter - 6155/5750

24

Re: with productivity STRING_SPLIT

invm wrote:

TaPaK;
At me for string_split parllelnyj / serial CPU time - 7689/3078
And it should be comparable. As it is visible for fn_splitter - 6155/5750

You about performance entirely?

25

Re: with productivity STRING_SPLIT

invm wrote:

1. Function split_string really brake;
2. Nevertheless, split_string much faster functions ';
3. There is an aberrant behavior split_string in parallel plans.

And so at the HARDWARE the function faster split_string, from for  last in parallel plans also turns out. It and on yours  is visible.
In 9 times  if split_string it is fulfilled parallely, and in 5 times - if it is fulfilled in one flow, and function ' in many (there, clearly, itself split_string faster, but function ' a beret by parallel performance)
'
For one million lines:
function, queryoption, CPU, duration , reads
fn_splitter, querytraceon 8649,63406, 17820 , 46081026
fn_splitter, maxdop 1, 198781, 208033 , 46077786
string_split, querytraceon 8649,263141, 140406 , 6515962
string_split, maxdop 1, 79266, 87837 , 6515919