Showing posts with label Algorithms. Show all posts

Tuesday, November 5, 2024

Complexity Analysis

 Overview
Asymptotic notation is used to describe the time complexity of an algorithm, i.e. how the time taken to execute the algorithm grows with the input size n.
The following are some of the major notation systems: 
  1. Big O is used for the worst case running time.
  2. Big Theta (Θ) is used when the running time is the same for all cases.
  3. Big Omega (Ω) is used for the best case running time.
Of the three, Big O notation is the most widely used. Big O notation can also be used to describe space complexity, where it bounds memory consumption instead of time.

Details
Big O, pronounced "Big Oh", is commonly read as "order of". It expresses an upper bound on an algorithm's time complexity, since it analyses the worst-case behaviour of the algorithm. It provides an upper limit on the time taken by an algorithm in terms of the input size n.

For an input of size n, the time complexity of common algorithms can be categorized under Big O notation as below. Each row describes how running time grows as the input grows.

Name         Notation    Description                                      Example
Constant     O(1)        Time taken is constant.                          Assignments, system function calls
Logarithmic  O(log n)    Doubling the input adds only a constant          Binary search
                         amount of time.
Linear       O(n)        Time grows in direct proportion to n.            Find an entry in an unsorted array
Log linear   O(n log n)  Time grows in proportion to n * log n.           Quick sort
Polynomial   O(n^c)      Time grows as a polynomial n^c, where c can      Selection sort [ O(n^2) ]
                         be 2 (quadratic), 3 (cubic), etc.
Exponential  O(c^n)      Time grows as c^n, where c can be 2 or higher.   Print all subsets of a set or array [ O(2^n) ]
Factorial    O(n!)       Time grows factorially with n.                   Permutations of a set

The following chart graphically compares the Big O growth rates.

The following table compares input size vs time for various time complexities.


There are many ways to compute an algorithm's time complexity. 

For simple algorithms, one way is to count each code statement that is executed: assignment statements, if-else branches, switch statements, loops, system function calls, and so on.
Assume each statement costs a certain number of execution units, or eus.
For example, an assignment or a system function call can be taken to cost 1 eu.
An if-else check can be taken to cost 2 eus.
The eu cost of a block of code is the aggregate of the eus of its individual statements, with some variations on a case-by-case basis.
The time complexity of an algorithm is then calculated from the number of eus it generates for an input of size n.
This count stands in for the actual running time and can be expressed as a polynomial such as a*n^b + c*n^d + ... + z.
The highest-degree term of the polynomial determines the Big O category.
It's discussed in detail below.

Code / Execution Units / Category

Constant, O(1):

    i=10;                                     1 eu
    v[20]=20;                                 1 eu
    int_vector.at(10)=5;                      1 eu

    if (i > 10)                               2 eus
        return 1;                             1 eu
    else                                      1 eu
        return 2;                             1 eu

Logarithmic, O(log n):

    for (int i=64; i>=1; i/=2)                8 eus
    {
        intlist.push_back(i);                 7 eus
        intlist2.push_back(i);                7 eus
    }
    total: 3*log2(64) + 4 = 22 eus

    for (int i=1; i<64; i*=2)                 7 eus
    {
        intlist.push_back(i);                 6 eus
        intlist2.push_back(i);                6 eus
    }
    total: 3*log2(64) + 1 = 19 eus

Square root, O(sqrt(n)):

    for (int i=0; i*i<64; ++i)                9 eus
    {
        intlist.push_back(i);                 8 eus
        intlist2.push_back(i);                8 eus
    }
    total: 3*sqrt(64) + 1 = 25 eus

Linear, O(n):

    for (int i=0; i<n; ++i)                   n+1 eus
    {
        intlist.push_back(i);                 n eus
        intlist2.push_back(i);                n eus
    }
    total: 3n + 1 eus

Polynomial, O(n^2):

    for (int i=0; i<n; ++i)                   n+1 eus
    {
        for (int j=0; j<n; ++j)               n*(n+1) eus
        {
            int_map[i]=i+j;                   n*n eus
        }
    }
    total: 2n^2 + 2n + 1 eus

Exponential, O(2^n):

    fun(4);
    void fun (int n)
    {
        if (n > 0)
        {
            fun(n-1);                         15 calls
            fun(n-1);                         15 calls
        }
    }
    total: 2^5 - 1 = 31 calls

Factorial, O(n!):

    string s("abc");
    permute(s);
    void permute(string& s, int idx=0)
    {
        static int len = s.size()-1;
        if (idx == len)                       10 eus
        {
            return;                           6 eus
        }

        for (int i = idx; i < s.size(); i++)  10 eus
        {
            swap(s[idx], s[i]);               9 eus
            permute(s, idx + 1);              9 eus
            swap(s[idx], s[i]);               9 eus
        }
    }
    total: 6*3! + 17 = 53 eus

For recursive algorithms the statement-counting method above can become overwhelming. In such cases the number of calls generated, comparisons made, or swaps performed can be counted instead to estimate time complexity.

constant O(1) 
An algorithm is considered constant time when it takes the same time to execute irrespective of the input size.
Consider the table above.
The assignment statements take O(1).
The function call at() takes O(1), since assigning a value to an element of a 10-item vector takes the same time as in a 1000-item vector.

In all these cases the input size has no effect, so constant time complexity is represented as O(1).

logarithmic O(log n)
The time taken to execute a logarithmic algorithm for an input of size n is comparable to log2(n) in the worst case.
The binary search algorithm takes O(log n), as shown in the tree diagram below. The data set contains the integers [20,30,40,50,80,90,100]. Searching for the value 40 takes 3 comparisons: [0,6], [0,2] and [2,2].


Consider the following code that performs binary search. The loop creates vectors of sizes 7, 14, and 21, populates them with serial numbers, and performs a binary search for a number not in range, thus triggering the worst case.

vector<string> comps;
int bs(int *data, int val, int low, int high)
{
    if ( low < 0 || low > high)
        return -1;
    int m = (low+high)/2;
    comps.push_back(string("[") + to_string(low) + "," + to_string(high) + "]");
    if (val == data[m])
        return m; 
    else if (val < data[m])
        return bs(data, val, low, m-1);
    else
        return bs(data, val, m+1, high);
}

for (int i = 1; i < 4; ++i)
{
    vector<int> v(7*i);
    iota(begin(v),end(v),1);
    for (auto i:v) cout << i << " "; 
    cout << endl;
    comps.clear();
    bs(v.data(), 50, 0, v.size()-1);
    sort(begin(comps),end(comps),[](const string& a, const string& b){return (a.length() < b.length());});
    for (auto i:comps) cout << i << " "; 
    cout << endl << endl;
}

Each call to bs() does a constant amount of work, so the number of comparisons in each case can be taken as the running time.

When n = 7,     3  compares are performed thus time =  3  or O(3)   or approximately   O(log 7).
When n = 14,   4  compares are performed thus time =  4  or O(4)   or approximately   O(log 14).
When n = 21,   5  compares are performed thus time =  5  or O(5)   or approximately   O(log 21).

In general Logarithmic time can be represented as O(log n).
linear O(n) 
The time taken to execute a linear algorithm for an input of size n is comparable to n in the worst case.
Consider the following code that performs a find operation on an unsorted array. 
The loop creates vectors of sizes 7, 14, and 21, populates them with randomly ordered numbers, and performs a find operation for a random element.

vector<string> swaps;
int knt=0;
int find(int *data, int val, int low, int high)
{
    for (int i=low; i<=high; ++i)
    {
        knt++;
        swaps.push_back(string("[") + to_string(data[i]) + "," + to_string(val) +  "]");
        if (data[i] == val)
        {
            return i;
        }
    }
    return -1;
}


for (int i = 1; i < 4; ++i)
{
    knt = 0;
    vector<int> v(7*i);
    iota(begin(v),end(v),1);
    shuffle(begin(v), end(v), default_random_engine(2018));
    for (auto i:v) cout << i << " "; 
    cout << endl;
    swaps.clear();
    int n = v.size()-1;
    find(v.data(), v[n-rand()%3], 0, n);
    sort(begin(swaps),end(swaps),[](const string& a, const string& b){return (a.length() < b.length());});
    for (auto i:swaps) cout << i << " "; 
    cout << endl;
    cout << "# compares:" << knt;
    cout << endl << endl;

}

Each comparison in find() does a constant amount of work, so the number of comparisons in each case can be taken as the running time.

When n = 7,     6  compares are performed thus time =    6  or O(6)   or approximately   O(7).
When n = 14,  13 compares are performed thus time =  13  or O(13) or approximately O(14).
When n = 21,  21 compares are performed thus time =  21  or O(21) or approximately O(21).

In general linear time can be represented as O(n).

log linear O(n log n)
The time taken to execute a log linear algorithm for an input of size n is comparable to n * log2(n) in the worst case.
The quick sort algorithm takes O(n log n), as shown in the tree diagram below. The data set contains the integers [9,-3,5,2,6,8,-6,1,3]. Sorting it performed 16 comparisons, as shown below.
[9,3] [5,3] [2,3] [6,3] [8,3] [1,3] [2,1] [8,6] [5,6] [9,6] [9,8] [-3,3] [-6,3] [-3,1] [-6,1] [-3,-6] 


Consider the following code that performs quick sort. 
The loop creates vectors of sizes 5, 10, and 15, populates them with randomly ordered numbers, and quick sorts them.
vector<string> comps;
int knt = 0, knt2=0;
void swap(int &l, int &r)
{
    ++knt2;
    swap<int>(l, r);
}

int  partition( int *data, int low, int high)
{
    int ltpvt = low - 1;
    for (int i=low; i<high; ++i)
    {
        ++knt;
        comps.push_back(string("[") + to_string(data[i]) + "," + to_string(data[high])  + "]");
        if (data[i] < data[high])
        {
            
            ++ltpvt;
            if (i != ltpvt)
                swap(data[i], data[ltpvt]);
        }
    }
    ++ltpvt;
    if (high != ltpvt)
        swap(data[high], data[ltpvt]);
    return ltpvt;
}

void qs(int *data, int low, int high, int n)
{
    if (low >= high) 
        return;

    int pi = partition(data,low,high);
    qs(data,low, pi-1,n);
    qs(data,pi+1,high,n);
}

    for (int i=1; i<4; ++i)
    {
        vector<int>  v(5*i);
        iota(begin(v),end(v),1);
        shuffle(begin(v), end(v), default_random_engine(2018));

        knt = 0;
        knt2 = 0;
        for (auto i:v) cout << i << " "; cout << endl;
        comps.clear();
        qs(v.data(),0,v.size()-1,v.size());
        sort(begin(comps),end(comps),[](const string& a, const string& b){return (a.length() < b.length());});
        for (auto i:comps) cout << i << " "; cout << endl;
        for (auto i:v) cout << i << " "; cout << endl;
        cout << "# swaps:" << knt2 << endl;
        cout << "# compares:" << knt << endl;
        cout << endl;
    }

Each comparison in partition() does a constant amount of work, so the number of comparisons in each case can be taken as the running time.

When n = 5,    10 compares are performed thus time = 10  or O(10) or approximately O(5 log 5).
When n = 10,  32 compares are performed thus time = 32  or O(32) or approximately O(10 log 10).
When n = 15,  41 compares are performed thus time = 41  or O(41) or approximately O(15 log 15).

In general log linear time can be represented as O(n log n).

polynomial O(n^2)
The time taken to execute a polynomial algorithm for an input of size n is comparable to n^2 in the worst case.
Consider the following code that performs selection sort. 
The loop creates vectors of sizes 5, 10, and 15, populates them with randomly ordered numbers, and selection sorts them.
vector<string> comps;
int knt = 0, knt2=0;
void swap(int &l, int &r)
{
    ++knt2;
    swap<int>(l, r);
}

void ss(int *data, int low, int high, int n)
{
    for (int i=low; i < high; ++i)
    {
        int k=i;
        for (int j=i+1; j <= high; ++j)
        {
            ++knt;
            comps.push_back(string("[")+to_string(data[k])+","+to_string(data[j])+"]");
            if (data[k] > data[j])
            {
                k = j;
            }
                
        }   
        if (k != i)
            swap(data[i], data[k]);
    }
}
    for (int i=1; i<4; ++i)
    {
        vector<int>  v(5*i);
        iota(rbegin(v),rend(v),1);
        shuffle(begin(v), end(v), default_random_engine(2018));
        knt = 0;
        knt2 = 0;
        for (auto i:v) cout << i << " "; cout << endl;
        comps.clear();
        ss(v.data(),0,v.size()-1,v.size());
        sort(begin(comps),end(comps),[](const string& a, const string& b){return (a.length() < b.length());});
        for (auto i:comps) cout << i << " "; cout << endl;
        for (auto i:v) cout << i << " "; cout << endl;
        cout << "# compares:" << knt << endl;
        cout << "# swaps:" << knt2 << endl;
        cout << endl;
    }

Each comparison in ss() does a constant amount of work, so the number of comparisons in each case can be taken as the running time.

When n = 5,      10 compares are performed thus time =   10  or  O(10)  or approximately O(5^2).
When n = 10,    45 compares are performed thus time =   45  or  O(45)  or approximately O(10^2).
When n = 15,  105 compares are performed thus time =  105 or  O(105) or approximately O(15^2).
In general polynomial time can be represented as O(n^2).
Exponential O(k^n)
The time taken to execute an exponential algorithm for an input of size n is comparable to k^n in the worst case, where k can be 2, 3, etc.
For example, the backtracking algorithm to find subsets takes O(2^n).
In the example below, the data set contains the integers [1,2,3]. The algorithm produces 8 subsets: [] [3] [2] [1] [23] [13] [12] [123]. Its time complexity can be calculated as 2^3 = 8.


Consider the following code that prints all possible subsets of a set. 
The loop creates strings of sizes 4, 5, and 6, populates them with randomly ordered characters, and finds and prints the subsets.
vector<string> subsets;
int knt = 0;
void FindSubsets(const string& arr, int index, string& currentSubset) 
{

    if (index >= arr.size()) 
    {
        knt++;
        subsets.push_back("[" + currentSubset + "] ");
        return;
    }

    string temp(currentSubset);
    FindSubsets(arr, index + 1, temp);

    currentSubset.push_back(arr[index]);
    FindSubsets(arr, index + 1, currentSubset);
}


    for (int i=1; i<4; ++i)
    {
        string currentSubset;
        string  v(3+i,' ');
        iota(begin(v),end(v),'a');
        shuffle(begin(v), end(v), default_random_engine(2018));
        
        knt = 0;
        for (auto i:v) cout << i << " "; cout << endl;
        subsets.clear();
        FindSubsets(v,0,currentSubset);
        sort(begin(subsets), end(subsets),[](string& a, string& b){return a.size() < b.size();});
        for (auto i:subsets) cout << i;
        cout << endl;
        cout << "# subsets:" << knt << endl;
        cout << endl;
    } 

Each call to FindSubsets() does a constant amount of work, so the number of subsets generated in each case can be taken as the running time.

When n = 4,      16 subsets are generated  thus time  =   16  or  O(16)   or approximately O(2^4).
When n = 5,      32 subsets are generated  thus time  =   32  or  O(32)   or approximately O(2^5).
When n = 6,      64 subsets are generated  thus time  =   64 or  O(64)    or approximately O(2^6).

In general exponential time can be represented as O(k^n); in this example k = 2.
Factorial O(n!)
The time taken to execute a factorial algorithm for an input of size n is comparable to n! in the worst case.
For example, the backtracking algorithm to find permutations takes O(n!).
In the example below, the data set contains the letters [A,B,C]. The algorithm produces 6 permutations: [ABC] [ACB] [BAC] [BCA] [CBA] [CAB]. Its time complexity can be calculated as 3! = 6.


Consider the following code that prints all possible permutations of a set. 
The loop creates strings of sizes 3, 4, and 5, populates them with characters, and finds and prints the permutations.
vector<string> permutations;
int knt = 0;
void FindPermutatons(const string& arr, int index) 
{
    if (index >= arr.size()) 
    {
        knt++;
        permutations.push_back("[" + arr + "] ");
        return;
    }

    for (int i=index; i < arr.size(); ++i)
    {
        string temp(arr);
        swap(temp[i],temp[index]);
        FindPermutatons(temp, index + 1);
    }
}

    for (int i=1; i<4; ++i)
    {
        string  v(2+i,' ');
        iota(begin(v),end(v),'a');
        
        knt = 0;
        for (auto i:v) cout << i << " "; cout << endl;
        permutations.clear();
        FindPermutatons(v,0);
        sort(begin(permutations), end(permutations),[](string& a, string& b){return a.size() < b.size();});
        for (auto i:permutations) cout << i;
        cout << endl;
        cout << "# Permutatons:" << knt << endl;
        cout << endl;
    } 

Each call to FindPermutatons() does a constant amount of work, so the number of permutations generated in each case can be taken as the running time.

When n = 3,    6 permutations are generated thus time = 6    or O(6)    or approximately O(3!).
When n = 4,   24 permutations are generated thus time = 24  or O(24)  or approximately O(4!).
When n = 5,  120 permutations are generated thus time = 120 or O(120) or approximately O(5!).

In general factorial time can be represented as O(n!).



Monday, October 7, 2024

uninitialized memory

Overview
Uninitialized memory algorithms populate objects in uninitialized memory. They construct the objects in place, which makes it possible to place fully constructed copies of elements into a range of raw memory, such as a memory block obtained by a call to malloc.

Details
Name / Description

ForwardIterator uninitialized_copy (InputIterator first, InputIterator last, ForwardIterator target)
Copy-constructs the elements in the range [first,last) into the uninitialized memory starting at target. Returns an iterator past the last constructed element.

Example
const char *numbers[] = {"one", "two", "three", "four"};
size_t sz = sizeof(numbers)/sizeof(const char *);
char *buf = new char[sizeof(string)*sz];
string *sbuf = reinterpret_cast<string*>(buf);

auto last = std::uninitialized_copy ( numbers, numbers+sz, sbuf );

//prints one two three four 
for (auto itr = sbuf; itr != last; ++itr)
    std::cout << *itr  << " ";

for (auto itr = sbuf; itr != last; ++itr)
    itr->~string();
delete[] buf;
ForwardIterator uninitialized_copy_n (InputIterator first, size_t n, ForwardIterator result)
Same as uninitialized_copy except n elements are copied.

Examples:
const char *numbers[] = {"one", "two", "three", "four"};
size_t sz = sizeof(numbers)/sizeof(const char *);
char *buf = new char[sizeof(string)*sz];
string *sbuf = reinterpret_cast<string*>(buf);

auto last = std::uninitialized_copy_n ( numbers, sz, sbuf );

//prints one two three four 
for (auto itr = sbuf; itr != last; ++itr)
    std::cout << *itr  << " ";

for (auto itr = sbuf; itr != last; ++itr)
    itr->~string();
delete[] buf;
void uninitialized_fill (ForwardIterator first, ForwardIterator last, const T& x)
Constructs elements in the uninitialized memory range [first,last), initializing each one from x.

Examples:
size_t sz = 4;
char *buf = new char[sizeof(string)*sz];
string *sbuf = reinterpret_cast<string*>(buf);

std::uninitialized_fill ( sbuf, sbuf+sz, "veda");

//prints veda veda veda veda  
for (auto itr = sbuf; itr != (sbuf+sz); ++itr)
    std::cout << *itr  << " ";

for (auto itr = sbuf; itr != (sbuf+sz); ++itr)
    itr->~string();
delete[] buf;
ForwardIterator uninitialized_fill_n (ForwardIterator first, size_t n, const T& x)
Same as uninitialized_fill except n elements are filled.

Examples:
size_t sz = 4;
char *buf = new char[sizeof(string)*sz];
string *sbuf = reinterpret_cast<string*>(buf);

std::uninitialized_fill_n ( sbuf, sz, "veda");

//prints veda veda veda veda  
for (auto itr = sbuf; itr != (sbuf+sz); ++itr)
    std::cout << *itr  << " ";

for (auto itr = sbuf; itr != (sbuf+sz); ++itr)
    itr->~string();
delete[] buf;

The example depicts the usage.





Monday, December 18, 2023

Algorithms - Partitions

Overview
Partitions are commonly used for searching and sorting. Partitioning algorithms operate on various container types (sequence, associative, and unordered) to perform actions such as partitioning, sorting, and more. 
These functions accept parameters such as 
  • iterators: forward, bidirectional, input, output 
  • predicates: unary
The parameter type UnaryPredicate is a function that accepts a single argument and returns a bool (true or false). 
For example, bool isodd(int i) {return (i % 2) == 1;} or [](int i) {return (i % 2) == 1;}

Details
Name / Description

ForwardIterator partition (ForwardIterator first, ForwardIterator last, UnaryPredicate pred)
Partitions the elements in the range [first,last) into two groups such that all the elements for which pred returns true precede all those for which it returns false. 
The returned iterator points to the first element of the second group.

Example:
vector<int> v{3,2,6,5,4,1};

//v = {3,1,5,6,4,2}
//itr = begin(v)+3
auto itr = partition(begin(v), end(v), 
[](int n) {return ((n % 2) == 1); });
BidirectionalIterator stable_partition (BidirectionalIterator first, BidirectionalIterator last, UnaryPredicate pred)
Same as partition except the relative order of the elements within each group is preserved.

Example:
vector<int> v{3,2,6,5,4,1};

//v = {3,5,1,2,6,4}
//itr = begin(v)+3
auto itr = stable_partition(begin(v), end(v), 
[](int n) {return ((n % 2) == 1); });
pair<OutputIterator,OutputIterator> partition_copy (InputIterator first, InputIterator last, OutputIterator itrtrue, OutputIterator itrfalse, UnaryPredicate pred)
Copies the elements in the range [first,last) for which pred returns true into the range starting at itrtrue, and those for which it does not into the range starting at itrfalse. 
Returns a pair of iterators to the ends of the two generated ranges.

Example:
vector<int> v{3,2,6,5,4,1},vodd(3),veven(3);

//p:{end(vodd),end(veven)}
//vodd = {3,5,1}
//veven = {2,6,4}
auto p = partition_copy(begin(v), end(v), begin(vodd),begin(veven),
    [](int n) {return ((n % 2) == 1); });
ForwardIterator partition_point (ForwardIterator first, ForwardIterator last, UnaryPredicate pred)
Returns an iterator to the first element in the partitioned range [first,last) for which pred is not true, i.e. its partition point.
Note that the range must already be partitioned; otherwise the behavior is undefined.

Example:
vector<int> v{1,2,3,4,5};

//itr = begin(v)+2
auto itr = partition_point(begin(v), end(v), 
    [](int n) {return (n < 3); });
bool is_partitioned (InputIterator first, InputIterator last, UnaryPredicate pred)
Tests whether a range is partitioned. 
Returns true if all the elements in the range [first,last) for which pred returns true precede those for which it returns false.

Examples:
vector<int> v{1,2,3,4,5}, v2 {4,3,1,5,2};

//b = true
bool b = is_partitioned(begin(v)+3, end(v), 
    [](int n) {return (n < 3); });

//b = false
b = is_partitioned(begin(v2)+3, end(v2), 
    [](int n) {return (n < 3); });

The example depicts the usage.
