Recent

Author Topic: Calculation of the pI of proteins according to Skoog and Wichman  (Read 1386 times)

s_bue

  • New member
  • *
  • Posts: 6
I wanted to try a script by Skoog and Wichman to calculate the pI of proteins. Unfortunately I'm not good at programming (had just a basic fpc course 10 years back in school). From all I remembered and logical thinking I thought I could fix the script but if I run it, it still doesn't work properly. If I run it with Lazarus it closes the script without giving a result. If I run it with the https://www.onlinegdb.com/online_pascal_compiler it sometimes calculates pI's but is far off the results I get from https://web.expasy.org/cgi-bin/compute_pi/pi_tool .
For example imagine a protein consisting out of all the charged amino acids arg, asp, cys, glu, his, lys, tyr all of them only once and the total peptide consisting only out of these 7 chains it calculates a pI of 1.967. If I enter the order RDCEHKY (short forms of the amino acid short forms  :D) in expasy it comes to a pI of 6.74.
It would be so nice if somebody could check the code and probably tell me if there is some major mistake or what might be the problem or even fix it  ;D O:-)
If sb. needs the paper from Skoog and Wichman let me know :) !

PROGRAM Iso_point;
CONST    pKa3_arg=12.5; pKa3_asp=3.9; pKa3_cys=8.3;
   pKa3_glu=4.3; pKa3_his=6.0; pKa3_lys=10.5;
   pKa3_tyr=10.1;
   Max_chains=5;
   Max_iterations=100;
   pH1=1;   pH2=14;   Accuracy=0.0005;
   All = 'alaargasnaspcysglngluglyhisileleulysmetpheproserthrtrptyrval';
TYPE   Str3 = String[3];
   Str80 = String[80];
   Aminoacid = ARRAY [1..Max_chains] OF REAL;
   Turn = ARRAY [1..Max_iterations] OF REAL;
   N_values = ARRAY [1..20] OF REAL;
   C_values = ARRAY [1..20] OF REAL;
VAR   Name : Str80;
   Test : Str3;
   Num_arg, Num_asp, Num_cys,
   Num_glu, Num_his, Num_lys,
   Num_tyr, Num_chains : INTEGER;
   N_term,
   C_term : Aminoacid;
   Test_pH,
   Charge : Turn;
   TermN, TermC : REAL;
   Number, N : INTEGER;
   NtermpK : N_values;
   CtermpK : C_values;
   Sum,Temp : REAL;
FUNCTION Neg_charge (pK_value : REAL;
         pH : Turn) : REAL;
BEGIN
Neg_charge:=(exp(2.3026*((pH[N])-(pK_value))))/(1+exp(2.3026*((pH[N])-(pK_value))));
END; {Neg_charge}
FUNCTION Pos_charge (pK_value : REAL;
         pH : Turn) : REAL;
BEGIN
   Pos_charge:=(1/(exp(2.3026*(((pH[N])-(pK_value))))+1));
END; {Pos_charge}
PROCEDURE Findend;
BEGIN
NtermpK[1]:=9.9; NtermpK[2]:=9.0; NtermpK[3]:=8.8; NtermpK[4]:=9.8;
   NtermpK[5]:=10.8; NtermpK[6]:=9.1; NtermpK[7]:=9.7; NtermpK[8]:=9.8;
   NtermpK[9]:=9.2; NtermpK[10]:=9.8; NtermpK[11]:=9.7; NtermpK[12]:=9.0;
   NtermpK[13]:=9.3; NtermpK[14]:=9.2; NtermpK[15]:=10.6; NtermpK[16]:=9.2;
   NtermpK[17]:=9.1; NtermpK[18]:=9.4; NtermpK[19]:=9.1; NtermpK[20]:=9.7;
   CtermpK[1]:=2.4; CtermpK[2]:=2.2; CtermpK[3]:=2.1; CtermpK[4]:=2.1;
   CtermpK[5]:=1.7; CtermpK[6]:=2.2; CtermpK[7]:=2.2; CtermpK[8]:=2.4;
   CtermpK[9]:=1.8; CtermpK[10]:=2.3; CtermpK[11]:=2.3; CtermpK[12]:=2.2;
   CtermpK[13]:=2.1; CtermpK[14]:=2.2; CtermpK[15]:=2.0; CtermpK[16]:=2.2;
   CtermpK[17]:=2.1; CtermpK[18]:=2.4; CtermpK[19]:=2.2; CtermpK[20]:=2.3;
Number:=Pos(Test,All);
   Number:=(Number+2) DIV 3;
   TermN:=NtermpK[Number];
   TermC:=CtermpK[Number];
    END; {Findend}
PROCEDURE Check (VAR Instr : Str3);
VAR Count : INTEGER;
BEGIN
   FOR count:= 1 TO 3 DO IF Instr[count] IN ['A'..'Z'] THEN
   Instr[count] := CHR(ORD(Instr[count])+32);
END; {Check}
PROCEDURE Input;
VAR n : 1..Max_chains;
BEGIN
   Write('Name of protein/peptide: ');
   Readln(Name);
   Writeln;
   Writeln('Please, enter the number of charged amino acids');
   Write('arg: '); Readln(Num_arg);
   Write('asp: '); Readln(Num_asp);
   Write('cys: '); Readln(Num_cys);
   Write('glu: '); Readln(Num_glu);
   Write('his: '); Readln(Num_his);
   Write('lys: '); Readln(Num_lys);
   Write('tyr: '); Readln(Num_tyr);
   Writeln;
   Write('Number of polypeptide chains: ');
   Readln(Num_chains);
   FOR n:=1 TO Num_chains DO
   BEGIN
      Write('Chain',n:2,' N-terminal amino acid: ');
      Readln(Test);
      Check(Test);
      Findend;
      C_term[n]:=TermC;
   END;
END; {Input}

Procedure Find_charge;
VAR I : 1..Max_chains;
BEGIN
Sum:=0;
Sum:=Sum-(Num_asp)*(Neg_charge(pKa3_asp,Test_pH));
Sum:=Sum-(Num_glu)*(Neg_charge(pKa3_glu,Test_pH));
Sum:=Sum-(Num_cys)*(Neg_charge(pKa3_cys,Test_pH));
Sum:=Sum-(Num_tyr)*(Neg_charge(pKa3_tyr,Test_pH));
Sum:=Sum+(Num_his)*(Pos_charge(pKa3_his,Test_pH));
Sum:=Sum+(Num_lys)*(Pos_charge(pKa3_lys,Test_pH));
Sum:=Sum+(Num_arg)*(Pos_charge(pKa3_arg,Test_pH));
FOR i:=1 to Num_chains DO
BEGIN
   Temp:=(Pos_charge(N_term,Test_pH));
      Sum:=Sum+Temp;
      Temp:=(Neg_charge(C_term,Test_pH));
      Sum:=Sum-Temp;
      Temp:=Sum;
END;
Charge[N]:=Temp;
END; {find_charge}
PROCEDURE Iterate;
VAR Factor, Diff : REAL;
BEGIN
N:=1; Test_pH[N]:=pH1; Find_charge;
N:=2; Test_pH[N]:=pH2; Find_charge;
REPEAT
N:=N+1;
BEGIN
 Factor:=((Test_pH[N-1])-(Test_pH[N-2]))/((Charge[N-1])-(Charge[N-2]));
 Test_pH[N]:=(Test_pH[N-1])-(Charge[N-1]*Factor);
 Find_charge;
 Temp:=(Charge[N])-(Charge[N-1]);
 Diff:=ABS(Temp);
END;
   UNTIL (Diff)<=(Accuracy);
   Writeln;
   Writeln('Isoelectric point calculated to pH= ',Test_pH[N]:5:3);
   Writeln;
   Writeln('Accuracy= ',Accuracy:7:5,' pH units');
   Writeln('Number of iterations= ',N);
END; {Iterate}

BEGIN {main}
   Input;
   Iterate;
END.

Bart

  • Hero Member
  • *****
  • Posts: 3428
    • Bart en Mariska's Webstek
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #1 on: April 25, 2019, 06:04:35 pm »
Please put your code inside [ code ] [ /code ] tages (without the spaces, they are there to fool the forum software).
Now part of the code is in Italic, problay because you used [ i ] as an index (again, without the spaces), which the forum software treats as "start italic".

Bart

lucamar

  • Hero Member
  • *****
  • Posts: 1483
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #2 on: April 25, 2019, 06:43:14 pm »
[...] without the spaces, they are there to fool the forum software [...]

You don't need to do that; instead use:
  [nobbc][code][/code][/nobbc]

More work for you but clearer for the (new) readers :)
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus 1.8.4 & 2.0.2 w/FPC 3.0.4 on:
(K|L)Ubuntu 12..16, Windows XP SP3, various DOSes.

Bart

  • Hero Member
  • *****
  • Posts: 3428
    • Bart en Mariska's Webstek
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #3 on: April 25, 2019, 07:22:44 pm »
You don't need to do that; instead use:
  [nobbc][code][/code][/nobbc]

Never knew that.
I made a note of this for my self for future reference.

Thanks!

Bart

Bart

  • Hero Member
  • *****
  • Posts: 3428
    • Bart en Mariska's Webstek
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #4 on: April 25, 2019, 07:27:08 pm »
If sb. needs the paper from Skoog and Wichman let me know :) !

How on earth are we able to comment on your code, if we do not know the correct algorithm at all?

Bart

Bart

  • Hero Member
  • *****
  • Posts: 3428
    • Bart en Mariska's Webstek
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #5 on: April 25, 2019, 07:54:08 pm »
I compiled your code:

Quote
Compile Project, Mode: Debug, Target: isopoint.exe: Success, Warnings: 1, Hints: 1
isopoint.lpr(20,2) Warning: Variable "N_term" read but nowhere assigned
isopoint.lpr(24,2) Note: Local variable "TermN" is assigned but never used

Maybe fix those warnings first?

Bart

lucamar

  • Hero Member
  • *****
  • Posts: 1483
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #6 on: April 25, 2019, 08:22:51 pm »
You don't need to do that; instead use:
  [nobbc][code][/code][/nobbc]
Never knew that.
I made a note of this for my self for future reference.

Thanks!

Bart

It pays to read the docs :)
https://wiki.simplemachines.org/smf/Bulletin_board_code
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus 1.8.4 & 2.0.2 w/FPC 3.0.4 on:
(K|L)Ubuntu 12..16, Windows XP SP3, various DOSes.

s_bue

  • New member
  • *
  • Posts: 6
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #7 on: April 26, 2019, 01:46:28 am »
Sorry, it was my first post in this forum wasn't aware about how to post stuff properly :(
Please put your code inside [ code ] [ /code ] tages (without the spaces, they are there to fool the forum software).
Now part of the code is in Italic, problay because you used [ i ] as an index (again, without the spaces), which the forum software treats as "start italic".



Bart

Code: [Select]
PROGRAM Iso_point;
CONST pKa3_arg=12.5; pKa3_asp=3.9; pKa3_cys=8.3;
pKa3_glu=4.3; pKa3_his=6.0; pKa3_lys=10.5;
pKa3_tyr=10.1;
Max_chains=5;
Max_iterations=100;
pH1=1; pH2=14; Accuracy=0.0005;
All = 'alaargasnaspcysglngluglyhisileleulysmetpheproserthrtrptyrval';
TYPE Str3 = String[3];
Str80 = String[80];
Aminoacid = ARRAY [1..Max_chains] OF REAL;
Turn = ARRAY [1..Max_iterations] OF REAL;
N_values = ARRAY [1..20] OF REAL;
C_values = ARRAY [1..20] OF REAL;
VAR Name : Str80;
Test : Str3;
Num_arg, Num_asp, Num_cys,
Num_glu, Num_his, Num_lys,
Num_tyr, Num_chains : INTEGER;
N_term,
C_term : Aminoacid;
Test_pH,
Charge : Turn;
TermN, TermC : REAL;
Number, N : INTEGER;
NtermpK : N_values;
CtermpK : C_values;
Sum,Temp : REAL;
FUNCTION Neg_charge (pK_value : REAL;
pH : Turn) : REAL;
BEGIN
Neg_charge:=(exp(2.3026*((pH[N])-(pK_value))))/(1+exp(2.3026*((pH[N])-(pK_value))));
END; {Neg_charge}
FUNCTION Pos_charge (pK_value : REAL;
pH : Turn) : REAL;
BEGIN
Pos_charge:=(1/(exp(2.3026*(((pH[N])-(pK_value))))+1));
END; {Pos_charge}
PROCEDURE Findend;
BEGIN
NtermpK[1]:=9.9; NtermpK[2]:=9.0; NtermpK[3]:=8.8; NtermpK[4]:=9.8;
NtermpK[5]:=10.8; NtermpK[6]:=9.1; NtermpK[7]:=9.7; NtermpK[8]:=9.8;
NtermpK[9]:=9.2; NtermpK[10]:=9.8; NtermpK[11]:=9.7; NtermpK[12]:=9.0;
NtermpK[13]:=9.3; NtermpK[14]:=9.2; NtermpK[15]:=10.6; NtermpK[16]:=9.2;
NtermpK[17]:=9.1; NtermpK[18]:=9.4; NtermpK[19]:=9.1; NtermpK[20]:=9.7;
CtermpK[1]:=2.4; CtermpK[2]:=2.2; CtermpK[3]:=2.1; CtermpK[4]:=2.1;
CtermpK[5]:=1.7; CtermpK[6]:=2.2; CtermpK[7]:=2.2; CtermpK[8]:=2.4;
CtermpK[9]:=1.8; CtermpK[10]:=2.3; CtermpK[11]:=2.3; CtermpK[12]:=2.2;
CtermpK[13]:=2.1; CtermpK[14]:=2.2; CtermpK[15]:=2.0; CtermpK[16]:=2.2;
CtermpK[17]:=2.1; CtermpK[18]:=2.4; CtermpK[19]:=2.2; CtermpK[20]:=2.3;
Number:=Pos(Test,All);
Number:=(Number+2) DIV 3;
TermN:=NtermpK[Number];
TermC:=CtermpK[Number];
    END; {Findend}
PROCEDURE Check (VAR Instr : Str3);
VAR Count : INTEGER;
BEGIN
FOR count:= 1 TO 3 DO IF Instr[count] IN ['A'..'Z'] THEN
Instr[count] := CHR(ORD(Instr[count])+32);
END; {Check}
PROCEDURE Input;
VAR n : 1..Max_chains;
BEGIN
Write('Name of protein/peptide: ');
Readln(Name);
Writeln;
Writeln('Please, enter the number of charged amino acids');
Write('arg: '); Readln(Num_arg);
Write('asp: '); Readln(Num_asp);
Write('cys: '); Readln(Num_cys);
Write('glu: '); Readln(Num_glu);
Write('his: '); Readln(Num_his);
Write('lys: '); Readln(Num_lys);
Write('tyr: '); Readln(Num_tyr);
Writeln;
Write('Number of polypeptide chains: ');
Readln(Num_chains);
FOR n:=1 TO Num_chains DO
BEGIN
Write('Chain',n:2,' N-terminal amino acid: ');
Readln(Test);
Check(Test);
Findend;
C_term[n]:=TermC;
END;
END; {Input}

Procedure Find_charge;
VAR I : 1..Max_chains;
BEGIN
Sum:=0;
Sum:=Sum-(Num_asp)*(Neg_charge(pKa3_asp,Test_pH));
Sum:=Sum-(Num_glu)*(Neg_charge(pKa3_glu,Test_pH));
Sum:=Sum-(Num_cys)*(Neg_charge(pKa3_cys,Test_pH));
Sum:=Sum-(Num_tyr)*(Neg_charge(pKa3_tyr,Test_pH));
Sum:=Sum+(Num_his)*(Pos_charge(pKa3_his,Test_pH));
Sum:=Sum+(Num_lys)*(Pos_charge(pKa3_lys,Test_pH));
Sum:=Sum+(Num_arg)*(Pos_charge(pKa3_arg,Test_pH));
FOR i:=1 to Num_chains DO
BEGIN
Temp:=(Pos_charge(N_term[i],Test_pH));
Sum:=Sum+Temp;
Temp:=(Neg_charge(C_term[i],Test_pH));
Sum:=Sum-Temp;
Temp:=Sum;
END;
Charge[N]:=Temp;
END; {find_charge}
PROCEDURE Iterate;
VAR Factor, Diff : REAL;
BEGIN
N:=1; Test_pH[N]:=pH1; Find_charge;
N:=2; Test_pH[N]:=pH2; Find_charge;
REPEAT
N:=N+1;
BEGIN
 Factor:=((Test_pH[N-1])-(Test_pH[N-2]))/((Charge[N-1])-(Charge[N-2]));
 Test_pH[N]:=(Test_pH[N-1])-(Charge[N-1]*Factor);
 Find_charge;
 Temp:=(Charge[N])-(Charge[N-1]);
 Diff:=ABS(Temp);
END;
UNTIL (Diff)<=(Accuracy);
Writeln;
Writeln('Isoelectric point calculated to pH= ',Test_pH[N]:5:3);
Writeln;
Writeln('Accuracy= ',Accuracy:7:5,' pH units');
Writeln('Number of iterations= ',N);
END; {Iterate}

BEGIN {main}
Input;
Iterate;
END.



s_bue

  • New member
  • *
  • Posts: 6
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #8 on: April 26, 2019, 01:51:30 am »
I compiled your code:

Quote
Compile Project, Mode: Debug, Target: isopoint.exe: Success, Warnings: 1, Hints: 1
isopoint.lpr(20,2) Warning: Variable "N_term" read but nowhere assigned
isopoint.lpr(24,2) Note: Local variable "TermN" is assigned but never used

Maybe fix those warnings first?

Bart

I think maybe N_term is TermN ... might make sense I'll change this and make it all to N_term :) 

JLWest

  • Hero Member
  • *****
  • Posts: 519
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #9 on: April 26, 2019, 01:59:32 am »
Wow - Well this looks impossible so you brain-e-acks should have the answer to this one in 20 min.

@Bart I ask question all the time without a clue about the   algo-whatever.

My answer to this problem would be 'No!

I gave up at the line: 'All = 'alaargasnaspcysglngluglyhisileleulysmetpheproserthrtrptyrval';

Hope this isn't used like:
 if  alaargasnaspcysglngluglyhisileleulysmetpheproserthrtrptyrval = -1 then exit; end;

I do check in on post and read code to try to get better.
JLWEST 
 FPC 3.0.4, Lazarus IDE v1.8.2 Windows 10 Pro
 Intel i7 770K CPU 4.2GHz 32702MB Ram
GeForce GTX 1080 Graphics - 8 Gig
500 GB SSD
3 Terabyte conventional disk space.

VTwin

  • Hero Member
  • *****
  • Posts: 655
  • Former Turbo Pascal 3 user
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #10 on: April 26, 2019, 03:28:52 am »
I gave up at the line: 'All = 'alaargasnaspcysglngluglyhisileleulysmetpheproserthrtrptyrval';

Those are all the amino acids, alanine to valine.

@s_bue
Please post a link to the Skoog and Wichman paper. I did a library search without success. Also, do you have an example of input with correct output? It is not clear to me from your post, and the link you give does not work.
« Last Edit: April 26, 2019, 04:37:13 am by VTwin »
“Talk is cheap. Show me the code.” -Linus Torvalds

macOS 10.11.6: Lazarus 2.1.0 svn 61174M (64 bit Cocoa trunk)
Ubuntu 18.04.2: Lazarus 2.0.0 (64 bit on VBox)
Windows 7 Pro SP1: Lazarus 2.0.0 (64 bit on VBox)

lucamar

  • Hero Member
  • *****
  • Posts: 1483
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #11 on: April 26, 2019, 05:03:45 am »
@s_bue
Please post a link to the Skoog and Wichman paper.

He did better: attached it to his post

Although it's a little hard to read the code ...
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus 1.8.4 & 2.0.2 w/FPC 3.0.4 on:
(K|L)Ubuntu 12..16, Windows XP SP3, various DOSes.

Lutz Mändle

  • New member
  • *
  • Posts: 49
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #12 on: April 26, 2019, 06:43:54 am »
@s_bue
You forgot some lines in the for loop of the input procedure.

The complete for loop should go as this:

Code: Pascal  [Select]
  1.   FOR n:=1 TO Num_chains DO
  2.   BEGIN
  3.     Write('Chain',n:2,' N-terminal amino acid: ');
  4.     Readln(Test);
  5.     Check(Test);
  6.     Findend;
  7.     N_term[n]:=TermN;
  8.     Write('Chain',n:2,' C-terminal amino acid: ');
  9.     Readln(Test);
  10.     Check(Test);
  11.     Findend;
  12.     C_term[n]:=TermC;
  13.   END;
  14.  

valdir.marcos

  • Hero Member
  • *****
  • Posts: 711
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #13 on: April 26, 2019, 07:37:58 am »
I wanted to try a script by Skoog and Wichman to calculate the pI of proteins. Unfortunately I'm not good at programming (had just a basic fpc course 10 years back in school). From all I remembered and logical thinking I thought I could fix the script but if I run it, it still doesn't work properly. If I run it with Lazarus it closes the script without giving a result. If I run it with the https://www.onlinegdb.com/online_pascal_compiler it sometimes calculates pI's but is far off the results I get from https://web.expasy.org/cgi-bin/compute_pi/pi_tool .
For example imagine a protein consisting out of all the charged amino acids arg, asp, cys, glu, his, lys, tyr all of them only once and the total peptide consisting only out of these 7 chains it calculates a pI of 1.967. If I enter the order RDCEHKY (short forms of the amino acid short forms  :D) in expasy it comes to a pI of 6.74.
It would be so nice if somebody could check the code and probably tell me if there is some major mistake or what might be the problem or even fix it  ;D O:-)
If sb. needs the paper from Skoog and Wichman let me know :) !

PROGRAM Iso_point;
CONST    pKa3_arg=12.5; pKa3_asp=3.9; pKa3_cys=8.3;
   pKa3_glu=4.3; pKa3_his=6.0; pKa3_lys=10.5;
   pKa3_tyr=10.1;
   Max_chains=5;
   Max_iterations=100;
   pH1=1;   pH2=14;   Accuracy=0.0005;
   All = 'alaargasnaspcysglngluglyhisileleulysmetpheproserthrtrptyrval';
TYPE   Str3 = String[3];
   Str80 = String[80];
   Aminoacid = ARRAY [1..Max_chains] OF REAL;
   Turn = ARRAY [1..Max_iterations] OF REAL;
   N_values = ARRAY [1..20] OF REAL;
   C_values = ARRAY [1..20] OF REAL;
VAR   Name : Str80;
   Test : Str3;
   Num_arg, Num_asp, Num_cys,
   Num_glu, Num_his, Num_lys,
   Num_tyr, Num_chains : INTEGER;
   N_term,
   C_term : Aminoacid;
   Test_pH,
   Charge : Turn;
   TermN, TermC : REAL;
   Number, N : INTEGER;
   NtermpK : N_values;
   CtermpK : C_values;
   Sum,Temp : REAL;
FUNCTION Neg_charge (pK_value : REAL;
         pH : Turn) : REAL;
BEGIN
Neg_charge:=(exp(2.3026*((pH[N])-(pK_value))))/(1+exp(2.3026*((pH[N])-(pK_value))));
END; {Neg_charge}
FUNCTION Pos_charge (pK_value : REAL;
         pH : Turn) : REAL;
BEGIN
   Pos_charge:=(1/(exp(2.3026*(((pH[N])-(pK_value))))+1));
END; {Pos_charge}
PROCEDURE Findend;
BEGIN
NtermpK[1]:=9.9; NtermpK[2]:=9.0; NtermpK[3]:=8.8; NtermpK[4]:=9.8;
   NtermpK[5]:=10.8; NtermpK[6]:=9.1; NtermpK[7]:=9.7; NtermpK[8]:=9.8;
   NtermpK[9]:=9.2; NtermpK[10]:=9.8; NtermpK[11]:=9.7; NtermpK[12]:=9.0;
   NtermpK[13]:=9.3; NtermpK[14]:=9.2; NtermpK[15]:=10.6; NtermpK[16]:=9.2;
   NtermpK[17]:=9.1; NtermpK[18]:=9.4; NtermpK[19]:=9.1; NtermpK[20]:=9.7;
   CtermpK[1]:=2.4; CtermpK[2]:=2.2; CtermpK[3]:=2.1; CtermpK[4]:=2.1;
   CtermpK[5]:=1.7; CtermpK[6]:=2.2; CtermpK[7]:=2.2; CtermpK[8]:=2.4;
   CtermpK[9]:=1.8; CtermpK[10]:=2.3; CtermpK[11]:=2.3; CtermpK[12]:=2.2;
   CtermpK[13]:=2.1; CtermpK[14]:=2.2; CtermpK[15]:=2.0; CtermpK[16]:=2.2;
   CtermpK[17]:=2.1; CtermpK[18]:=2.4; CtermpK[19]:=2.2; CtermpK[20]:=2.3;
Number:=Pos(Test,All);
   Number:=(Number+2) DIV 3;
   TermN:=NtermpK[Number];
   TermC:=CtermpK[Number];
    END; {Findend}
PROCEDURE Check (VAR Instr : Str3);
VAR Count : INTEGER;
BEGIN
   FOR count:= 1 TO 3 DO IF Instr[count] IN ['A'..'Z'] THEN
   Instr[count] := CHR(ORD(Instr[count])+32);
END; {Check}
PROCEDURE Input;
VAR n : 1..Max_chains;
BEGIN
   Write('Name of protein/peptide: ');
   Readln(Name);
   Writeln;
   Writeln('Please, enter the number of charged amino acids');
   Write('arg: '); Readln(Num_arg);
   Write('asp: '); Readln(Num_asp);
   Write('cys: '); Readln(Num_cys);
   Write('glu: '); Readln(Num_glu);
   Write('his: '); Readln(Num_his);
   Write('lys: '); Readln(Num_lys);
   Write('tyr: '); Readln(Num_tyr);
   Writeln;
   Write('Number of polypeptide chains: ');
   Readln(Num_chains);
   FOR n:=1 TO Num_chains DO
   BEGIN
      Write('Chain',n:2,' N-terminal amino acid: ');
      Readln(Test);
      Check(Test);
      Findend;
      C_term[n]:=TermC;
   END;
END; {Input}

Procedure Find_charge;
VAR I : 1..Max_chains;
BEGIN
Sum:=0;
Sum:=Sum-(Num_asp)*(Neg_charge(pKa3_asp,Test_pH));
Sum:=Sum-(Num_glu)*(Neg_charge(pKa3_glu,Test_pH));
Sum:=Sum-(Num_cys)*(Neg_charge(pKa3_cys,Test_pH));
Sum:=Sum-(Num_tyr)*(Neg_charge(pKa3_tyr,Test_pH));
Sum:=Sum+(Num_his)*(Pos_charge(pKa3_his,Test_pH));
Sum:=Sum+(Num_lys)*(Pos_charge(pKa3_lys,Test_pH));
Sum:=Sum+(Num_arg)*(Pos_charge(pKa3_arg,Test_pH));
FOR i:=1 to Num_chains DO
BEGIN
   Temp:=(Pos_charge(N_term,Test_pH));
      Sum:=Sum+Temp;
      Temp:=(Neg_charge(C_term,Test_pH));
      Sum:=Sum-Temp;
      Temp:=Sum;
END;
Charge[N]:=Temp;
END; {find_charge}
PROCEDURE Iterate;
VAR Factor, Diff : REAL;
BEGIN
N:=1; Test_pH[N]:=pH1; Find_charge;
N:=2; Test_pH[N]:=pH2; Find_charge;
REPEAT
N:=N+1;
BEGIN
 Factor:=((Test_pH[N-1])-(Test_pH[N-2]))/((Charge[N-1])-(Charge[N-2]));
 Test_pH[N]:=(Test_pH[N-1])-(Charge[N-1]*Factor);
 Find_charge;
 Temp:=(Charge[N])-(Charge[N-1]);
 Diff:=ABS(Temp);
END;
   UNTIL (Diff)<=(Accuracy);
   Writeln;
   Writeln('Isoelectric point calculated to pH= ',Test_pH[N]:5:3);
   Writeln;
   Writeln('Accuracy= ',Accuracy:7:5,' pH units');
   Writeln('Number of iterations= ',N);
END; {Iterate}

BEGIN {main}
   Input;
   Iterate;
END.
There are some mistakes on your source code.
My attached example for Lazarus compiles, but it needs correct values for input and output for testing it.

Just curious, after 3 decades since 1986, is this calculus still valid?

Code: Pascal  [Select]
  1. unit Unit1;
  2.  
  3. // Calculation of the pI of proteins according to Skoog and Wichman
  4. // https://forum.lazarus.freepascal.org/index.php/topic,45188.0.html
  5.  
  6. // Book:
  7. // TRAC: Trends in Analytical Chemistry, Volume 5
  8. // edited by C. J. W. Brooks, A. F. Fell, R. W. Frei
  9. // https://books.google.com.br/books?id=vIPNCgAAQBAJ&pg=PA83&lpg=PA83&dq=%22Isoelectric+point+determination+of+polypeptides%22&source=bl&ots=P6U2JITr8G&sig=ACfU3U1_--_v_3rglRuY7k93XmUXRNOqOg&hl=pt-BR&sa=X&ved=2ahUKEwiFmoer7uzhAhVBLLkGHRffBakQ6AEwAHoECAAQAQ#v=onepage&q=%22Isoelectric%20point%20determination%20of%20polypeptides%22&f=false
  10.  
  11. {$mode objfpc}{$H+}
  12.  
  13. interface
  14.  
  15. uses
  16.   Classes, SysUtils, FileUtil, Forms, Controls, Graphics, Dialogs, StdCtrls;
  17.  
  18. const
  19.    pKa3_arg       = 12.5;
  20.    pKa3_asp       = 3.9;
  21.    pKa3_cys       = 8.3;
  22.    pKa3_glu       = 4.3;
  23.    pKa3_his       = 6.0;
  24.    pKa3_lys       = 10.5;
  25.    pKa3_tyr       = 10.1;
  26.    Max_chains     = 5;
  27.    Max_iterations = 100;
  28.    pH1            = 1;
  29.    pH2            = 14;
  30.    Accuracy       = 0.0005;
  31.    All = 'alaargasnaspcysglngluglyhisileleulysmetpheproserthrtrptyrval';
  32.  
  33. type
  34.   Str3      = String[3];
  35.   Str80     = String[80];
  36.   Aminoacid = ARRAY [1..Max_chains] OF Double;
  37.   Turn      = ARRAY [1..Max_iterations] OF Double;
  38.   N_values  = ARRAY [1..20] OF Double;
  39.   C_values  = ARRAY [1..20] OF Double;
  40.  
  41. type
  42.  
  43.   { TForm1 }
  44.  
  45.   TForm1 = class(TForm)
  46.     Button1: TButton;
  47.     Edit1: TEdit;
  48.     Edit10: TEdit;
  49.     Edit11: TEdit;
  50.     Edit12: TEdit;
  51.     Edit2: TEdit;
  52.     Edit3: TEdit;
  53.     Edit4: TEdit;
  54.     Edit5: TEdit;
  55.     Edit6: TEdit;
  56.     Edit7: TEdit;
  57.     Edit8: TEdit;
  58.     Edit9: TEdit;
  59.     Label1: TLabel;
  60.     Label10: TLabel;
  61.     Label11: TLabel;
  62.     Label12: TLabel;
  63.     Label13: TLabel;
  64.     Label14: TLabel;
  65.     Label2: TLabel;
  66.     Label3: TLabel;
  67.     Label4: TLabel;
  68.     Label5: TLabel;
  69.     Label6: TLabel;
  70.     Label7: TLabel;
  71.     Label8: TLabel;
  72.     Label9: TLabel;
  73.     procedure Button1Click(Sender: TObject);
  74.   private
  75.  
  76.   public
  77.  
  78.   end;
  79.  
  80. var
  81.   Form1: TForm1;
  82.  
  83.   Name: Str80;
  84.   Test: Str3;
  85.  
  86.   Num_arg,
  87.   Num_asp,
  88.   Num_cys,
  89.   Num_glu,
  90.   Num_his,
  91.   Num_lys,
  92.   Num_tyr,
  93.   Num_chains: Integer;
  94.  
  95.   N_term, C_term: Aminoacid;
  96.   Test_pH, Charge: Turn;
  97.   TermN, TermC: Double;
  98.  
  99.   Number, N: Integer;
  100.  
  101.   NtermpK: N_values;
  102.   CtermpK: C_values;
  103.   Sum, Temp: Double;
  104.  
  105. implementation
  106.  
  107. {$R *.lfm}
  108.  
  109. function Neg_charge (pK_value: Double; pH: Turn): Double;
  110. begin
  111.   Neg_charge:=(exp(2.3026*((pH[N])-(pK_value))))/(1+exp(2.3026*((pH[N])-(pK_value))));
  112. end; {Neg_charge}
  113.  
  114.  
  115. function Pos_charge (pK_value: Double; pH: Turn): Double;
  116. begin
  117.   Pos_charge := (1/(exp(2.3026*(((pH[N])-(pK_value))))+1));
  118. end; {Pos_charge}
  119.  
  120.  
  121. procedure Findend;
  122. begin
  123.   NtermpK[01] := 9.9;
  124.   NtermpK[02] := 9.0;
  125.   NtermpK[03] := 8.8;
  126.   NtermpK[04] := 9.8;
  127.   NtermpK[05] := 10.8;
  128.   NtermpK[06] := 9.1;
  129.   NtermpK[07] := 9.7;
  130.   NtermpK[08] := 9.8;
  131.   NtermpK[09] := 9.2;
  132.   NtermpK[10] := 9.8;
  133.   NtermpK[11] := 9.7;
  134.   NtermpK[12] := 9.0;
  135.   NtermpK[13] := 9.3;
  136.   NtermpK[14] := 9.2;
  137.   NtermpK[15] := 10.6;
  138.   NtermpK[16] := 9.2;
  139.   NtermpK[17] := 9.1;
  140.   NtermpK[18] := 9.4;
  141.   NtermpK[19] := 9.1;
  142.   NtermpK[20] := 9.7;
  143.  
  144.   CtermpK[01] := 2.4;
  145.   CtermpK[02] := 2.2;
  146.   CtermpK[03] := 2.1;
  147.   CtermpK[04] := 2.1;
  148.   CtermpK[05] := 1.7;
  149.   CtermpK[06] := 2.2;
  150.   CtermpK[07] := 2.2;
  151.   CtermpK[08] := 2.4;
  152.   CtermpK[09] := 1.8;
  153.   CtermpK[10] := 2.3;
  154.   CtermpK[11] := 2.3;
  155.   CtermpK[12] := 2.2;
  156.   CtermpK[13] := 2.1;
  157.   CtermpK[14] := 2.2;
  158.   CtermpK[15] := 2.0;
  159.   CtermpK[16] := 2.2;
  160.   CtermpK[17] := 2.1;
  161.   CtermpK[18] := 2.4;
  162.   CtermpK[19] := 2.2;
  163.   CtermpK[20] := 2.3;
  164.  
  165.   Number := Pos(Test, All);
  166.   Number := (Number+2) DIV 3;
  167.   TermN  := NtermpK[Number];
  168.   TermC  := CtermpK[Number];
  169. end; {Findend}
  170.  
  171.  
  172. PROCEDURE Check (VAR Instr : Str3);
  173. var
  174.   Count: Integer;
  175. begin
  176.    FOR Count := 1 TO 3 DO
  177.      IF Instr[Count] IN ['A'..'Z'] THEN
  178.        Instr[Count] := CHR(ORD(Instr[Count])+32);
  179. end; {Check}
  180.  
  181.  
  182. procedure Input;
  183. var
  184.   n : 1..Max_chains;
  185. BEGIN
  186.   // Write('Name of protein/peptide: ');
  187.   // Readln(Name);
  188.   Name := Form1.Edit1.Text;
  189.   // Writeln;
  190.   // Writeln('Please, enter the number of charged amino acids');
  191.   // Write('arg: '); Readln(Num_arg);
  192.   Num_arg := StrToInt(Form1.Edit2.Text);
  193.   // Write('asp: '); Readln(Num_asp);
  194.   Num_asp := StrToInt(Form1.Edit3.Text);
  195.   // Write('cys: '); Readln(Num_cys);
  196.   Num_cys := StrToInt(Form1.Edit4.Text);
  197.   // Write('glu: '); Readln(Num_glu);
  198.   Num_glu := StrToInt(Form1.Edit5.Text);
  199.   // Write('his: '); Readln(Num_his);
  200.   Num_his := StrToInt(Form1.Edit6.Text);
  201.   // Write('lys: '); Readln(Num_lys);
  202.   Num_lys := StrToInt(Form1.Edit7.Text);
  203.   // Write('tyr: '); Readln(Num_tyr);
  204.   Num_tyr := StrToInt(Form1.Edit8.Text);
  205.   // Writeln;
  206.   // Write('Number of polypeptide chains: ');
  207.   // Readln(Num_chains);
  208.   Num_chains := StrToInt(Form1.Edit9.Text);
  209.   FOR n := 1 TO Num_chains DO
  210.   begin
  211.     // Write('Chain',n:2,' N-terminal amino acid: ');
  212.     // Readln(Test);
  213.     Test := InputBox('Inform Chain ', 'Chain ' + RightStr('0' + IntToStr(n),2) + ' N-terminal amino acid:', 'abc');
  214.     Check(Test);
  215.     Findend;
  216.     C_term[n] := TermC;
  217.     N_term[n] := TermN;
  218.  
  219.  
  220.     // Write('Chain',n:2,' C-terminal amino acid: ');
  221.     // Readln(Test);
  222.     Test := InputBox('Inform Chain ', 'Chain ' + RightStr('0' + IntToStr(n),2) + ' C-terminal amino acid:', 'abc');
  223.     Check(Test);
  224.     Findend;
  225.     C_term[n] := TermC;
  226.     // ***************************************
  227.   end;
  228. end; {Input}
  229.  
  230.  
  231. procedure Find_charge;
  232. var
  233.   i: 1..Max_chains;
  234. begin
  235.   Sum := 0;
  236.   Sum := Sum - (Num_asp)*(Neg_charge(pKa3_asp,Test_pH));
  237.   Sum := Sum - (Num_glu)*(Neg_charge(pKa3_glu,Test_pH));
  238.   Sum := Sum - (Num_cys)*(Neg_charge(pKa3_cys,Test_pH));
  239.   Sum := Sum - (Num_tyr)*(Neg_charge(pKa3_tyr,Test_pH));
  240.   Sum := Sum + (Num_his)*(Pos_charge(pKa3_his,Test_pH));
  241.   Sum := Sum + (Num_lys)*(Pos_charge(pKa3_lys,Test_pH));
  242.   Sum := Sum + (Num_arg)*(Pos_charge(pKa3_arg,Test_pH));
  243.   FOR i := 1 to Num_chains DO
  244.   begin
  245.     Temp :=(Pos_charge(N_term[i], Test_pH));
  246.     Sum  := Sum +Temp;
  247.     Temp := (Neg_charge(C_term[i], Test_pH));
  248.     Sum  := Sum - Temp;
  249.     Temp := Sum;
  250.   end;
  251.   Charge[N] := Temp;
  252. end; {find_charge}
  253.  
  254.  
  255. procedure Iterate;
  256. var
  257.   Factor, Diff: Double;
  258. BEGIN
  259.   N          := 1;
  260.   Test_pH[N] := pH1;
  261.   Find_charge;
  262.  
  263.   N          := 2;
  264.   Test_pH[N] := pH2;
  265.   Find_charge;
  266.   REPEAT
  267.     N := N+ 1;
  268.     begin
  269.       Factor     := ((Test_pH[N-1])-(Test_pH[N-2]))/((Charge[N-1])-(Charge[N-2]));
  270.       Test_pH[N] := (Test_pH[N-1])-(Charge[N-1]*Factor);
  271.       Find_charge;
  272.  
  273.       Temp := (Charge[N])-(Charge[N-1]);
  274.       Diff := ABS(Temp);
  275.     end;
  276.   UNTIL (Diff)<=(Accuracy);
  277.  
  278.   // Writeln;
  279.   // Writeln('Isoelectric point calculated to pH = ', Test_pH[N]:5:3);
  280.   Form1.Edit10.Text := FormatFloat('00000.000', Test_pH[N]);
  281.   // Writeln;
  282.   // Writeln('Accuracy = ', Accuracy:7:5, ' pH units');
  283.   Form1.Edit11.Text := FormatFloat('0000000.00000', Accuracy);
  284.   // Writeln('Number of iterations = ', N);
  285.   Form1.Edit12.Text := IntToStr(N);
  286. end; {Iterate}
  287.  
  288. { TForm1 }
  289.  
  290. procedure TForm1.Button1Click(Sender: TObject);
  291. begin
  292.   Input;
  293.   Iterate;
  294. end;
  295.  
  296. end.

s_bue

  • New member
  • *
  • Posts: 6
Re: Calculation of the pI of proteins according to Skoog and Wichman
« Reply #14 on: April 26, 2019, 01:27:30 pm »
There are some mistakes on your source code.
My attached example for Lazarus compiles, but it needs correct values for input and output for testing it.

Just curious, after 3 decades since 1986, is this calculus still valid?

My supervisor and a prof. requested to calculate it like that (because they say it's accurate) but as I tried out two sequences it shows major differences to the ones you get from https://web.expasy.org/compute_pi/  (which is common practice). Here you just insert the letters for the amino acids if you want to double check:

Alanin                   Ala   A   
Arginin               Arg   R   
Asparagin               Asn   N   
Asparaginsäure      Asp   D   
Cystein               Cys   C   
Glutamin               Gln   Q   
Glutaminsäure       Glu   E   
Glycin               Gly   G
Histidin               His   H   
Isoleucin               Ile   I   
Leucin               Leu   L   
Lysin                       Lys   K   
Methionin               Met   M   
Phenylalanin       Phe   F   
Prolin               Pro   P   
Serin                       Ser   S   
Threonin               Thr   T   
Tryptophan       Trp   W   
Tyrosin               Tyr   Y   
Valin                       Val   V

I tried with Bradykinin (ArgProProGlyPheSerProPheArg --> RPPGFSPFR ) calculated by lazarus: pI 9.605 calculated by expasy: 12.00
Also I tried with a C-Phycocyaninbetasubunit (MetPheAspAlaPheThrLysValValSerGlnAlaAspThrArgGlyGluMetLeuSerThrAlaGlyIleAsp -->
MFDAFTKVVS QADTRGEMLS TAQID) calculated by lazarus was something really high while expasy comes to: 4.23
If you want to get amino acid sequences you can find all of them at https://www.uniprot.org/

I don't know your knowledge about Peptides the N terminus starts at the left hand side and the C terminus at the right hand side just in case you want to check the code by yourself. 

Now I have some questions. Might it be that the sequences are too big because max_chains is set to 5?
Also it is quite difficult to enter the number of tyrosins because it's on the bottom of the box. How do I move it further up?
Last but not least I want to change the way the program asks for the c and n subunits (first all the n subunits and afterwards all c subunits) is that possible?

Thank you all so far! It's really a great help!  :-[ :-*