Research Article | Volume 11, Issue 6, November, 2023

Evaluation and comparison of 15 short tandem repeat loci of south and west Indian population for use in personal identification applications

Prabakaran Mathiyazhagan, Thangaraju Palanimuthu, Agasthi Padmanathan

Open Access   

Published:  Oct 25, 2023

DOI: 10.7324/JABB.2023.145640
Abstract

The objective of this research was to analyze allele frequencies and forensically relevant parameters: match probability, power of exclusion, polymorphic information content, observed and expected heterozygosity, homozygosity, paternity index, and discrimination capacity (DC). In all, 209 unrelated individuals from South Indian populations and 201 from West Indian populations were examined, and their allele frequencies and statistical parameters were compared. Fifteen short tandem repeat (STR) loci were typed using the multiplex PowerPlex 16 System kit: CSF1PO, TPOX, D8S1179, TH01, Penta E, D18S51, D21S11, D3S1358, FGA, vWA, Penta D, D16S539, D7S820, D13S317, and D5S818. Allele frequencies and forensic parameters for the 15 STR loci were computed with the FORSTAT program. No departure from Hardy–Weinberg equilibrium was observed in either population. The maximum allele frequency was found at TPOX: 0.4976 in the South Indian population, 0.4925 in the West Indian population, and 0.5040 in the combined Indian population. The minimum allele frequency was observed at several loci in all three data sets. Penta E and TPOX were the most and least polymorphic loci, respectively, in the South and combined Indian populations, whereas D18S51 and TPOX were the most and least polymorphic loci in the West Indian population.


Keywords: Short tandem repeat; FORSTAT; Hardy–Weinberg; Indian population


Citation:

Mathiyazhagan P, Palanimuthu T, Padmanathan A. Evaluation and comparison of 15 short tandem repeat loci of south and West Indian population for use in personal identification applications. J App Biol Biotech. 2023;11(6):216-227. http://doi.org/10.7324/JABB.2023.145640

Copyright: Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike license.


1. INTRODUCTION

India is home to one of the most genetically diverse populations of any nation. Known as a cultural melting pot, it is home to people of many different languages, cultural backgrounds, and religious traditions [1]. There are 4693 recognized demographic subgroups in India, comprising 2205 recognized communities, 589 subgroups, and 1900 separate administrative regions [2]. South India consists of the states of Andhra Pradesh, Karnataka, Kerala, Tamil Nadu, and Telangana, as well as the Union Territories of Lakshadweep and Puducherry. As of April 2020, it had 253,051,953 people living in an area of 635,780 km² (245,480 sq miles) [3]. The western part of India includes the states of Maharashtra, Rajasthan, Gujarat, and Goa. As of April 2020, it had 173,343,821 people living in an area of 508,032 km² (196,152 sq miles) [4].

Forensic scientists now commonly use DNA analysis based on short tandem repeats (STRs) to determine paternity and to examine complex criminal cases, such as those involving rape, murder, and mass rape [5]. Because of their high diversity and widespread distribution throughout the human genome, STRs are useful genetic markers for identifying individuals. Multiplex polymerase chain reaction (PCR) has made it possible to amplify numerous loci at once, making STR typing a viable tool for human identification [6]. Autosomal STR loci are widely employed in forensics and paternity testing [7,8]. Based on STR allele frequencies, the odds of two unrelated people having identical STR profiles, or fingerprints, are “about one in 57 trillion” [9]. Several nations now have their own forensic DNA databases. In India, the “DNA Technologies (Use and Application) Regulation Bill – 2019” was drafted to establish the identity of missing individuals, victims, criminals, undertrials, and unknown deceased persons. The Bill provides for a DNA Regulatory Board (DRB) charged with performing the duties and exercising the authority specified in the Bill. It also provides for National and Regional DNA Data Banks, which will maintain the national forensic DNA database for identifying these categories of individuals. This will help improve the scientific level and efficiency of DNA testing in India [10]. The Bill is currently under consideration by the Indian government, which underscores the need for a comprehensive regional and state-level DNA database. Earlier investigators studied genetic polymorphism of 15 STR loci in 123 unrelated individuals from the Madhya Pradesh population and evaluated autosomal STR diversity and forensically important parameters. 
They concluded that locus Penta E showed the highest power of discrimination (0.978) and the highest polymorphism information content (0.900), and that all loci were highly polymorphic, showed significant genetic diversity, and had potential for forensic application. Another study investigated genomic diversity and population structure in the Muslim community of Telangana using 23 autosomal STRs in 184 randomly selected, unrelated, healthy individuals and found that locus SE33, with 37 observed alleles, had the highest number of alleles among the loci studied. SE33 was the most polymorphic and discriminatory locus and may be useful for forensic applications. Much similar STR research has focused on specific communities, such as the Muslim community of Telangana, the Teli population of Maharashtra, the Mina, Gujjar, and admixed populations of Rajasthan, and the Sikh population of central India. The present research covers two regions of India (South and West) and compares their allele frequencies and genetic parameters for use in forensic applications. In India, STR-based DNA technology has not been used as widely as in other countries to identify missing individuals, victims, criminals, undertrials, and unknown deceased persons. The objective of the research was to determine the allele frequencies and statistical parameters of medicolegal interest for 15 loci (Penta E, D18S51, D21S11, TH01, D3S1358, FGA, TPOX, D8S1179, vWA, Penta D, CSF1PO, D16S539, D7S820, D13S317, and D5S818) in a large sample of unrelated individuals from the South and West Indian populations. In this research, 410 buccal swab samples were collected from unrelated individuals from South and West Indian states after obtaining their informed consent. 
DNA was extracted from the collected samples and stored at –21°C. The concentration of the extracted DNA was quantified by measuring the absorbance at 260 nm (A260) in a spectrophotometer using a quartz cuvette. The quantified DNA was then amplified using the PP16 multiplex primer set. The amplified PCR products were run on an ABI Applied BioSystem 3130 sequencer. The resulting data were analyzed with GeneMapper software, and the allele frequencies and forensically significant parameters were then calculated using the FORSTAT software. The maximum allele frequency, found at TPOX, was 0.4976 in the South Indian population, 0.4925 in the West Indian population, and 0.5040 in the combined South and West Indian population. The sequence-based allelic frequencies reported here for the South and West Indian population groups would enable equipped laboratories to use the frequencies and forensically important parameters from this study for relationship testing or forensic identification. This work complements other sequence-based allelic frequency databases published for South and West Indian states and also fills a large data gap for the South and West Indian populations. In addition, the similarities and differences between the two populations have been compared, which would enable the forensic and relationship testing industry to use these parameters confidently when analyzing live STR-based casework such as paternity, maternity, and missing person analyses. The key limitations of previous studies were the difficulty of collecting samples from different parts of the Indian states and the high cost of analyzing a larger number of samples. 
The advantage of this study, aside from being a new approach to comparing the South and West Indian populations across 15 STR loci, is that its results can be used by DNA analysts in South and West Indian laboratories for live forensic and relationship casework, as well as for further research.


2. MATERIALS AND METHODS

Buccal swab samples were collected from 410 healthy unrelated individuals aged 17 to 23 years from South and West Indian states, after obtaining their informed consent: 209 from South India (123 males and 86 females) and 201 from West India (108 males and 102 females) [Figure 1]. The work was approved by the Ethics Committee of SRM Institute of Science and Technology, Chennai (1887/IEC/2020).

Figure 1: South and West Indian states from which the samples were collected.




2.1. DNA Extraction and Quantification

Genomic DNA was extracted from the buccal swabs using a swab solution kit supplied by Promega (Madison, United States).

2.2. Selection of STR Markers

PP16 STRs were chosen for this research because of their polymorphism. The PowerPlex 16 System is a multiplex STR system for use in DNA typing, including paternity testing, forensic DNA analysis, human identity testing, and strain identification. It allows coamplification and three-color detection of 16 loci (15 STR loci and Amelogenin): Penta E, D18S51, D21S11, TH01, D3S1358, FGA, TPOX, D8S1179, vWA, Amelogenin, Penta D, CSF1PO, D16S539, D7S820, D13S317, and D5S818. One primer for each of the Penta E, D18S51, D21S11, TH01, and D3S1358 loci is labeled with fluorescein (FL); one primer for each of the FGA, TPOX, D8S1179, vWA, and Amelogenin loci is labeled with carboxy-tetramethylrhodamine (TMR); and one primer for each of the Penta D, CSF1PO, D16S539, D7S820, D13S317, and D5S818 loci is labeled with 6-carboxy-4´,5´-dichloro-2´,7´-dimethoxy-fluorescein (JOE). All 16 loci are amplified simultaneously in a single tube and analyzed in a single injection. The alleles at each STR locus are distinguished from one another by the number of copies of the repeat sequence they contain. Discriminatory capacity tends to increase with the number of STR loci typed, so the likelihood that an individual shares an identical multi-locus STR profile with another individual drawn at random from the population is extremely low [11].

2.3. PCR Amplification

Multiplex PCR amplification was carried out using the PowerPlex 16 System (Promega, Madison, United States). PCR was performed in a 25 µL volume with 2 ng template DNA under the following conditions: an initial incubation at 95°C for 11 min; 28 cycles of denaturation at 94°C for 1 min, annealing at 59°C for 1 min, and extension at 72°C for 1 min; and a final extension at 60°C for 45 min, as per the manufacturer’s instructions.

2.4. Control Samples

For positive control, PowerPlex® 16 System-supplied DNA was used, which produced interpretable results in each run throughout the process.

For negative control, nuclease-free water was used in each run throughout the process.

For quality control, internally processed DNA (a known DNA profile) was used in each run throughout the process.

2.5. STR Typing of PowerPlex-Amplified Samples

Multicapillary electrophoresis of the amplification products was carried out on an ABI Prism 3130 Avant Genetic Analyzer (Applied BioSystems, Foster City, CA, USA). The data were examined using ABI Prism GeneMapper version 3.0 software (Applied BioSystems); the program automatically determined the relative peak areas and identified alleles by comparison with the allelic ladder supplied with the kit. Allele nomenclature followed the guidelines of the International Society for Forensic Haemogenetics [12]. For allele identification, the peak detection threshold was set to 50 RFU. Every step was carried out in accordance with the laboratory’s internal standards and with the kit manual and control procedures.
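The thresholding and ladder comparison described above can be illustrated with a minimal Python sketch. It is not GeneMapper's algorithm: the function name `call_alleles`, the ±0.5 bp matching window, and all fragment sizes are hypothetical values chosen for illustration, and only the 50 RFU threshold comes from the text.

```python
from typing import Dict, List, Tuple

RFU_THRESHOLD = 50.0  # peak detection threshold stated in Section 2.5


def call_alleles(
    peaks: List[Tuple[float, float]],
    ladder: Dict[str, float],
    window_bp: float = 0.5,  # assumed matching window, not taken from the paper
) -> List[str]:
    """Return allele names for peaks that pass the threshold and match a ladder bin.

    peaks  : (fragment size in bp, peak height in RFU) pairs for one locus
    ladder : allele name -> ladder fragment size in bp (hypothetical sizes)
    """
    calls = set()
    for size_bp, height_rfu in peaks:
        if height_rfu < RFU_THRESHOLD:
            continue  # below the detection threshold, so the peak is ignored
        # Find the ladder allele whose fragment size is closest to this peak
        name, ladder_size = min(ladder.items(), key=lambda kv: abs(kv[1] - size_bp))
        if abs(ladder_size - size_bp) <= window_bp:
            calls.add(name)
    # Allele names such as "9.3" sort numerically
    return sorted(calls, key=float)


# Hypothetical ladder sizes for a tetranucleotide locus such as TH01
th01_ladder = {"8": 171.0, "9": 175.0, "9.3": 178.0, "10": 179.0}
# Hypothetical peaks: two strong peaks and one 30 RFU peak below threshold
sample_peaks = [(175.1, 820.0), (178.05, 760.0), (171.2, 30.0)]

genotype = call_alleles(sample_peaks, th01_ladder)
print(genotype)
```

In this toy input the 30 RFU peak is discarded and the two remaining peaks are called as a 9/9.3 heterozygote.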

2.6. Analysis of the Data

Forensic and population genetics metrics (allele frequency, matching probability, polymorphism information content, expected and observed heterozygosity, homozygosity, power of exclusion, discrimination capacity, and paternity index) were determined using FORSTAT (Forensic Statistic Analysis Toolbox), a web tool that uses the common GENEPOP format to evaluate forensic genetics statistics. Its simplicity and ease of use come from automating data entry and calculations, which reduces analysis time and labor [13]. The typed sample profiles were entered into FORSTAT, which automatically computed the allele frequencies and forensic parameters.
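The parameters listed above have widely used textbook definitions, which the following Python sketch applies to the genotypes at a single locus. Whether FORSTAT uses exactly these estimators (for example, with or without small-sample corrections) is an assumption here, and the function name `locus_statistics` and the toy genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Sequence, Tuple


def locus_statistics(genotypes: Sequence[Tuple[str, str]]) -> Dict[str, float]:
    """Textbook forensic parameters for one locus from diploid genotypes."""
    n = len(genotypes)

    # Allele frequencies: allele count divided by 2N chromosomes
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = [c / (2 * n) for c in allele_counts.values()]

    # Genotype frequencies, with allele order ignored
    geno_counts = Counter(tuple(sorted(g)) for g in genotypes)

    h_exp = 1.0 - sum(p * p for p in freqs)               # 1 - sum(p_i^2)
    h_obs = sum(1 for a, b in genotypes if a != b) / n    # share of heterozygotes
    homo = 1.0 - h_obs                                    # observed homozygosity
    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2
    pic = h_exp - sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    mp = sum((c / n) ** 2 for c in geno_counts.values())  # sum of squared genotype freqs
    dc = 1.0 - mp                                         # discrimination capacity
    pe = h_obs ** 2 * (1.0 - 2.0 * h_obs * homo ** 2)     # power of exclusion
    pi = 1.0 / (2.0 * homo) if homo > 0 else float("inf") # typical paternity index

    return {"Hexp": h_exp, "Hobs": h_obs, "Homozygosity": homo, "PIC": pic,
            "MP": mp, "DC": dc, "PE": pe, "PI": pi}


# Hypothetical genotypes for ten individuals at one locus
toy_genotypes = [
    ("8", "8"), ("8", "11"), ("8", "9"), ("9", "11"), ("11", "11"),
    ("8", "11"), ("8", "8"), ("9", "9"), ("8", "11"), ("8", "9"),
]
stats = locus_statistics(toy_genotypes)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

With real data, the same function would be called once per locus and per population, which is the layout of the per-locus parameter table in the Results.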


3. RESULTS

The allele frequencies calculated for South, West, and combined South and West Indian population are given in Tables 1-3, respectively.

Table 1: Allele frequencies in South Indian population.

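Loci (column order): D3S1358, TH01, D21S11, D18S51, Penta_E, D5S818, D13S317, D7S820, D16S539, CSF1PO, Penta_D, vWA, D8S1179, TPOX, FGA
Each row gives the allele followed by its frequencies in column order; loci at which the allele was not observed are omitted.

Allele 2.2: 0.0048
Allele 5: 0.0024, 0.0909, 0.0024
Allele 6: 0.1889, 0.0072, 0.0024
Allele 7: 0.2751, 0.1651, 0.0048, 0.0024
Allele 8: 0.1507, 0.0096, 0.0478, 0.1268, 0.0143, 0.0215, 0.0048, 0.0215, 0.0072, 0.4976
Allele 9: 0.1722, 0.0167, 0.0454, 0.0861, 0.0550, 0.1627, 0.0167, 0.2320, 0.0024, 0.1507
Allele 9.3: 0.1986
Allele 10: 0.0119, 0.0072, 0.0885, 0.0718, 0.0598, 0.3206, 0.0933, 0.2177, 0.0957, 0.0669, 0.0454
Allele 11: 0.0143, 0.0909, 0.2368, 0.3133, 0.3038, 0.3014, 0.3301, 0.1316, 0.0694, 0.2536
Allele 12: 0.0981, 0.2033, 0.3421, 0.2344, 0.2655, 0.2153, 0.3517, 0.2153, 0.1316, 0.0454
Allele 13: 0.0048, 0.0981, 0.0837, 0.2368, 0.1316, 0.0407, 0.1794, 0.0694, 0.2129, 0.2608, 0.0024
Allele 14: 0.0837, 0.1411, 0.0717, 0.0143, 0.0454, 0.0215, 0.0071, 0.0598, 0.0933, 0.2440
Allele 15: 0.3086, 0.1722, 0.0359, 0.0048, 0.0024, 0.0024, 0.0024, 0.0143, 0.1651, 0.1698
Allele 15.2: 0.0024
Allele 15.4: 0.0024
Allele 16: 0.2871, 0.1411, 0.0431, 0.2297, 0.0431
Allele 17: 0.1962, 0.1411, 0.0478, 0.2320, 0.0048
Allele 18: 0.1124, 0.0813, 0.0239, 0.1842, 0.0119
Allele 18.2: 0.0072
Allele 19: 0.0048, 0.0406, 0.0143, 0.0813, 0.0526
Allele 19.2: 0.0024
Allele 20: 0.0024, 0.0383, 0.0072, 0.0072, 0.0885
Allele 20.2: 0.0024
Allele 21: 0.0096, 0.0024, 0.0072, 0.1555
Allele 22: 0.0096, 0.0024, 0.2009
Allele 22.2: 0.0024
Allele 23: 0.0239, 0.1531
Allele 24: 0.0024, 0.1531
Allele 25: 0.0933
Allele 26: 0.0024, 0.0454
Allele 27: 0.0383, 0.0287
Allele 28: 0.1746
Allele 29: 0.1794
Allele 29.2: 0.0024
Allele 30: 0.2416, 0.0024
Allele 30.2: 0.0311
Allele 31: 0.0718
Allele 31.2: 0.0885
Allele 32: 0.0143
Allele 32.2: 0.1124
Allele 33: 0.0024
Allele 33.1: 0.0048
Allele 33.2: 0.0239
Allele 34: 0.0024
Allele 35: 0.0072
Allele 36: 0.0024

Observed alleles: 46
Range of alleles: D3S1358 13-20; TH01 5-10; D21S11 26-36; D18S51 10-24; Penta_E 5-22; D5S818 8-15; D13S317 8-15; D7S820 8-13; D16S539 5-15; CSF1PO 8-15; Penta_D 2.2-15; vWA 14-21; D8S1179 8-17; TPOX 6-13; FGA 18-30
Recorded alleles at each locus: D3S1358 8; TH01 7; D21S11 17; D18S51 16; Penta_E 18; D5S818 8; D13S317 8; D7S820 6; D16S539 9; CSF1PO 8; Penta_D 11; vWA 8; D8S1179 10; TPOX 8; FGA 15; Total 157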

Table 2: Allele frequencies in West Indian population.

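Loci (column order): D3S1358, TH01, D21S11, D18S51, Penta_E, D5S818, D13S317, D7S820, D16S539, CSF1PO, Penta_D, vWA, D8S1179, TPOX, FGA
Each row gives the allele followed by its frequencies in column order; loci at which the allele was not observed are omitted.

Allele 2.2: 0.0199
Allele 3.2: 0.0025
Allele 5: 0.0423, 0.0124
Allele 6: 0.2363, 0.0025, 0.0099
Allele 7: 0.2686, 0.1219, 0.0149, 0.0199, 0.0149, 0.0025, 0.0025
Allele 8: 0.1094, 0.0223, 0.0124, 0.1219, 0.1666, 0.0149, 0.0025, 0.0199, 0.0149, 0.4925
Allele 9: 0.2189, 0.0124, 0.0622, 0.1642, 0.0970, 0.2015, 0.0199, 0.2487, 0.0025, 0.0995
Allele 9.3: 0.1567
Allele 10: 0.0075, 0.0049, 0.0945, 0.0448, 0.0920, 0.3035, 0.1468, 0.2413, 0.1517, 0.0821, 0.0472
Allele 11: 0.0025, 0.0049, 0.0945, 0.3433, 0.2214, 0.2562, 0.2637, 0.2786, 0.1542, 0.0025, 0.0721, 0.2463
Allele 12: 0.0025, 0.0970, 0.1866, 0.3557, 0.2413, 0.1293, 0.2264, 0.3756, 0.1666, 0.1194, 0.1019
Allele 13: 0.0075, 0.1194, 0.0920, 0.1567, 0.0920, 0.0274, 0.1169, 0.0622, 0.1393, 0.0049, 0.2910
Allele 13.2: 0.0025
Allele 14: 0.1019, 0.1716, 0.0597, 0.0099, 0.0622, 0.0298, 0.0049, 0.0721, 0.1094, 0.2214
Allele 14.2: 0.0025
Allele 15: 0.3159, 0.1492, 0.0771, 0.0049, 0.0049, 0.0721, 0.1318
Allele 16: 0.2761, 0.1343, 0.0547, 0.0049, 0.2587, 0.0572
Allele 17: 0.1915, 0.1443, 0.0547, 0.0025, 0.2686, 0.0075, 0.0025
Allele 17.2: 0.0025
Allele 18: 0.0771, 0.0323, 0.1642, 0.0025
Allele 19: 0.0970, 0.0448, 0.0199, 0.0895, 0.0622
Allele 20: 0.0075, 0.0149, 0.0199, 0.0249, 0.0920
Allele 20.3: 0.0075
Allele 21: 0.0199, 0.0049, 0.0049, 0.1691
Allele 22: 0.0124, 0.0025, 0.1866
Allele 22.2: 0.0025
Allele 23: 0.0025, 0.1592
Allele 24: 0.1468
Allele 25: 0.0025, 0.0049, 0.0796
Allele 25.2: 0.0025
Allele 26: 0.0572
Allele 27: 0.0249, 0.0224
Allele 28: 0.1269, 0.0025
Allele 29: 0.2015, 0.0025
Allele 30: 0.2612, 0.0025
Allele 30.2: 0.0223
Allele 31: 0.0696
Allele 31.2: 0.0920
Allele 32: 0.0249
Allele 32.2: 0.1069
Allele 33: 0.0075
Allele 33.2: 0.0422

Observed alleles: 42
Range of alleles: D3S1358 12-20; TH01 6-11; D21S11 25-33.2; D18S51 10-22; Penta_E 5-25; D5S818 7-14; D13S317 8-15; D7S820 7-13; D16S539 8-14; CSF1PO 7-14; Penta_D 2.2-17; vWA 11-21; D8S1179 8-18; TPOX 6-12; FGA 17-30
Recorded alleles at each locus: D3S1358 9; TH01 7; D21S11 12; D18S51 14; Penta_E 19; D5S818 8; D13S317 8; D7S820 7; D16S539 7; CSF1PO 8; Penta_D 15; vWA 10; D8S1179 11; TPOX 7; FGA 17; Total 159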

Table 3: Allele frequencies in combined South and West Indian population.

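Loci (column order): D3S1358, TH01, D21S11, D18S51, Penta_E, D5S818, D13S317, D7S820, D16S539, CSF1PO, Penta_D, vWA, D8S1179, TPOX, FGA
Each row gives the allele followed by its frequencies in column order; loci at which the allele was not observed are omitted.

Allele 2.2: 0.0101
Allele 3.2: 0.0010
Allele 5: 0.0010, 0.0753, 0.0010, 0.0030
Allele 6: 0.2067, 0.0061, 0.0050
Allele 7: 0.2749, 0.1201, 0.0132, 0.0091, 0.0101, 0.0030, 0.0030
Allele 8: 0.1334, 0.0142, 0.0285, 0.1374, 0.0926, 0.0162, 0.0040, 0.0193, 0.0091, 0.5040
Allele 9: 0.1578, 0.0152, 0.0478, 0.1262, 0.0814, 0.1802, 0.0264, 0.2423, 0.0030, 0.1171
Allele 9.3: 0.2169
Allele 10: 0.0081, 0.0040, 0.0845, 0.0600, 0.0794, 0.3136, 0.1272, 0.2270, 0.1374, 0.0763, 0.0448
Allele 11: 0.0010, 0.0081, 0.1018, 0.2942, 0.2688, 0.2790, 0.2790, 0.2953, 0.1517, 0.0020, 0.0692, 0.2545
Allele 12: 0.0926, 0.1782, 0.3553, 0.2342, 0.1894, 0.2219, 0.3594, 0.1853, 0.0010, 0.1201, 0.0702
Allele 13: 0.0071, 0.1181, 0.0794, 0.1873, 0.1038, 0.0336, 0.1466, 0.0682, 0.1680, 0.0030, 0.2698, 0.0010
Allele 13.2: 0.0020
Allele 14: 0.0936, 0.1537, 0.0631, 0.0112, 0.0448, 0.0264, 0.0071, 0.0590, 0.1008, 0.2474
Allele 15: 0.3146, 0.1680, 0.0631, 0.0020, 0.0030, 0.0010, 0.0020, 0.0101, 0.1232, 0.1507
Allele 16: 0.2810, 0.1323, 0.0509, 0.0020, 0.2454, 0.0468
Allele 17: 0.1914, 0.1446, 0.0549, 0.0010, 0.2464, 0.0061, 0.0010
Allele 17.2: 0.0020
Allele 18: 0.1048, 0.0794, 0.0386, 0.1771, 0.0010, 0.0142
Allele 18.2: 0.0040
Allele 19: 0.0061, 0.0427, 0.0224, 0.0814, 0.0570
Allele 19.2: 0.0010
Allele 20: 0.0010, 0.0264, 0.0152, 0.0142, 0.0896
Allele 20.2: 0.1249, 0.0020
Allele 21: 0.0132, 0.0061, 0.0050, 0.1517
Allele 22: 0.0122, 0.0061, 0.1914
Allele 22.2: 0.0020
Allele 23: 0.0020, 0.0050, 0.1517
Allele 24: 0.0010, 0.1527
Allele 25: 0.0030, 0.0916
Allele 25.2: 0.0010
Allele 26: 0.0498
Allele 27: 0.0285, 0.0264
Allele 28: 0.1527, 0.0030
Allele 29: 0.2036, 0.0020
Allele 29.2: 0.0010
Allele 30: 0.2545, 0.0030
Allele 30.2: 0.0234, 0.0010
Allele 30.3: 0.0010
Allele 31: 0.0692
Allele 31.2: 0.0865
Allele 32: 0.0173
Allele 32.2: 0.1059
Allele 33: 0.0061
Allele 33.2: 0.0376, 0.0010
Allele 34: 0.0020
Allele 34.2: 0.0010
Allele 35: 0.0061
Allele 36: 0.0050

Observed alleles: 52
Range of alleles: D3S1358 13-20.2; TH01 5-11; D21S11 27-36; D18S51 10-24; Penta_E 5-25; D5S818 7-15; D13S317 8-15; D7S820 7-13; D16S539 5-15; CSF1PO 7-15; Penta_D 2.2-17; vWA 11-21; D8S1179 8-18; TPOX 6-13; FGA 17-30.2
Recorded alleles at each locus: D3S1358 9; TH01 8; D21S11 17; D18S51 16; Penta_E 19; D5S818 9; D13S317 8; D7S820 7; D16S539 9; CSF1PO 9; Penta_D 15; vWA 11; D8S1179 11; TPOX 8; FGA 22; Total 182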

In the South Indian population, the minimum allele frequency of 0.0024 was found for the following alleles:

D3S1358: 20; TH01: 5; D21S11: 26, 29.2, 33, 34, and 36; D18S51: 15.2 and 24; Penta E: 15.4, 21, and 22; D13S317: 15; D16S539: 5 and 15; CSF1PO: 15; D8S1179: 9; TPOX: 6, 7, and 13; FGA: 19.2, 20.2, 22.2, and 30.

In the West Indian population, the minimum allele frequency of 0.0025 was found for the following alleles:

D3S1358: 12; TH01: 11; D21S11: 25; D18S51: 13.2 and 14.2; Penta E: 22 and 23; CSF1PO: 8; Penta D: 3.2, 6, 7, and 17; vWA: 11; D8S1179: 9 and 18; TPOX: 7; FGA: 17, 17.2, 22.2, 25.2, 28, 29, and 30.

In the combined population, the minimum allele frequency of 0.0010 was found for the following alleles:

D3S1358: 20; TH01: 5 and 11; D21S11: 29.2, 30.3, and 34.2; D18S51: 24; D16S539: 5 and 15; Penta D: 3.2 and 17; vWA: 12; D8S1179: 18; TPOX: 13; FGA: 17, 19.2, 25.2, 30.2, and 33.2.

Comparing the alleles between the population groups, alleles 15.2, 15.4, 18.2, 19.2, 20.2, 29.2, 33.1, 34, 35, and 36 were observed in the South Indian population but not in the West Indian population, whereas alleles 3.2, 13.2, 14.2, 17.2, 20.3, and 25.2 were recorded in the West Indian population but not in the South Indian population. In addition, alleles 10.3, 15.3, 16.2, 21.2, and 23.2 were found in the combined population but in neither the South nor the West Indian population.

A total of 157, 159, and 182 alleles were recorded in the South, West, and combined South and West Indian populations, respectively; the corresponding numbers of distinct alleles observed were 46, 42, and 54. At TPOX, the allele frequencies varied from 0.0024 to 0.4976 in the South Indian population, from 0.0025 to 0.4925 in the West Indian population, and from 0.0010 to 0.5040 in the combined Indian population. TPOX had the highest allele frequency in all three populations. The number of alleles per locus varied from 6 (D7S820) to 18 (Penta E) in the South, from 7 (TH01, D7S820, D16S539, and TPOX) to 19 (Penta E) in the West, and from 8 (TH01, D13S317, D7S820, and TPOX) to 23 (FGA) in the combined South and West Indian population. Expressed as a share of the distinct alleles observed, the highest value was in the West Indian population at the FGA locus (19/42 = 45.2%), followed by the combined South and West Indian population, also at FGA (23/54 = 42.6%), and the South Indian population at Penta E (18/46 = 39.1%).

Analysis of the range and number of alleles at each locus showed that Penta E had the widest allele range in all three populations (5-22, 10-25, and 5-25 in the South, West, and combined Indian populations, respectively) and the largest number of alleles in the South (18) and West (19) Indian populations, whereas FGA recorded the maximum of 23 alleles in the combined Indian population.

Most common allele (MCA) and least common allele (LCA) observed at different loci are given in Table 4.

Table 4: MCA and LCA alleles in South, West, and combined Indian populations.

STR loci | South MCA | South LCA | West MCA | West LCA | Combined MCA | Combined LCA
D3S1358* | 15 (129) | 20 (1) | 15 (127) | 12 (1) | 15 (309) | 20 (1)
TH01 | 9.3 (83) | 5 (1) | 7 (108) | 11 (1) | 7 (270) | 5, 11 (1)
D21S11* | 30 (101) | 26, 29.2, 33, 34, 36 (1) | 30 (105) | 25 (1) | 30 (250) | 29.2, 30.3, 34.2 (1)
D18S51 | 15 (72) | 15.2, 24 (1) | 14 (69) | 13.2, 14.2 (1) | 15 (165) | 24 (1)
Penta_E* | 12 (85) | 15.4, 21, 22 (1) | 12 (75) | 22, 23 (1) | 12 (175) | 19.4 (2)
D5S818* | 12 (143) | 15 (2) | 12 (143) | 14 (4) | 12 (349) | 15 (2)
D13S317 | 11 (131) | 15 (1) | 12 (97) | 15 (2) | 11 (264) | 15 (3)
D7S820* | 10 (134) | 8 (6) | 10 (122) | 7 (8) | 10 (308) | 7, 8 (9)
D16S539* | 11 (126) | 5, 15 (1) | 11 (106) | 8 (6) | 11 (274) | 5, 15 (1)
CSF1PO* | 12 (147) | 15 (1) | 12 (151) | 8 (1) | 12 (353) | 15 (2)
PentaD | 12 (90) | 2.2, 7 (2) | 9 (100) | 3.2, 6, 7, 17 (1) | 9 (238) | 3.2, 17 (1)
vWA* | 17 (97) | 20, 21 (3) | 17 (108) | 11 (1) | 17 (242) | 12 (1)
D8S1179* | 13 (109) | 9 (1) | 13 (117) | 9, 18 (1) | 13 (265) | 18 (1)
TPOX* | 8 (208) | 6, 7, 13 (1) | 8 (198) | 7 (1) | 8 (495) | 13 (1)
FGA* | 22 (84) | 19.2, 20.2, 22.2, 30 (1) | 22 (75) | 17, 17.2, 22.2, 25.2, 28, 29, 30 (1) | 22 (188) | 17, 19.2, 25.2, 30.2, 33.2 (1)
Subtotal | 1739 | 30 | 1701 | 27 | 4145 | 25
Total | South: 1769 | West: 1728 | Combined: 4170

* At the 11 marked loci (D16S539, CSF1PO, vWA, D3S1358, D21S11, D8S1179, TPOX, Penta E, D5S818, D7S820, and FGA), the same allele was the most common in the South, West, and combined Indian populations. Allele 12 was the most predominant allele in the South, West, and combined populations, occurring 465/1769 times (26.28%) in the South, 467/1728 times (27.02%) in the West, and 877/4170 times (21.03%) in the combined Indian population.

At many loci, the least common alleles were observed only once.
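Counts such as those in Table 4 convert to allele frequencies by dividing by the number of chromosomes sampled, which is twice the number of individuals. The short Python check below applies this to the South (209 individuals) and West (201 individuals) samples; the helper name `allele_frequency` is hypothetical.

```python
def allele_frequency(count: int, n_individuals: int) -> float:
    """Allele frequency = allele count / (2 x number of individuals)."""
    return count / (2 * n_individuals)


# TPOX allele 8 counts taken from Table 4
south_tpox_8 = round(allele_frequency(208, 209), 4)
west_tpox_8 = round(allele_frequency(198, 201), 4)

# An allele observed exactly once in each sample
south_singleton = round(allele_frequency(1, 209), 4)
west_singleton = round(allele_frequency(1, 201), 4)

print(south_tpox_8, west_tpox_8, south_singleton, west_singleton)
```

These values match the TPOX maximum frequencies (0.4976 and 0.4925) and the minimum allele frequencies (0.0024 and 0.0025) reported for the two samples.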

The polymorphism values of the STRs, expressed by various statistical parameters, are presented in Table 5 for the South, West, and combined Indian populations.

Table 5: Statistical parameters of STR loci for South, West, and combined South and West Indian population.

Loci | D3S1358 | TH01 | D21S11 | D18S51 | Penta_E | D5S818 | D13S317 | D7S820 | D16S539 | CSF1PO | Penta_D | vWA | D8S1179 | TPOX | FGA
MP (South) | 0.1166 | 0.1089 | 0.0816 | 0.0597 | 0.0444 | 0.1660 | 0.0847 | 0.1148 | 0.0987 | 0.1497 | 0.0869 | 0.0961 | 0.0878 | 0.1906 | 0.0670
MP (West) | 0.1350 | 0.1075 | 0.0825 | 0.0730 | 0.0710 | 0.1380 | 0.1195 | 0.1580 | 0.1170 | 0.1840 | 0.0955 | 0.1085 | 0.1055 | 0.2005 | 0.0830
MP (Combined) | 0.1075 | 0.0899 | 0.0627 | 0.0454 | 0.0343 | 0.1253 | 0.0784 | 0.1054 | 0.0855 | 0.1370 | 0.0705 | 0.0817 | 0.0731 | 0.1805 | 0.0512
PIC (South) | 0.7134 | 0.7427 | 0.8034 | 0.8554 | 0.8940 | 0.6666 | 0.7875 | 0.7423 | 0.7635 | 0.6721 | 0.7888 | 0.7695 | 0.7745 | 0.6039 | 0.8408
PIC (West) | 0.7061 | 0.7445 | 0.8085 | 0.8516 | 0.8494 | 0.7014 | 0.7500 | 0.6668 | 0.7429 | 0.6368 | 0.7785 | 0.7669 | 0.7708 | 0.5935 | 0.8323
PIC (Combined) | 0.7191 | 0.7520 | 0.8173 | 0.8635 | 0.8916 | 0.6954 | 0.7854 | 0.7310 | 0.7650 | 0.6725 | 0.7959 | 0.7826 | 0.7847 | 0.6074 | 0.8471
Hexp (South) | 0.7529 | 0.7783 | 0.8247 | 0.8692 | 0.9017 | 0.7126 | 0.8130 | 0.7756 | 0.7942 | 0.7179 | 0.8135 | 0.7979 | 0.8010 | 0.6503 | 0.8570
Hexp (West) | 0.7472 | 0.7790 | 0.8288 | 0.8658 | 0.8630 | 0.7417 | 0.7805 | 0.7174 | 0.7757 | 0.6933 | 0.8053 | 0.7965 | 0.7990 | 0.6475 | 0.8495
Hexp (Combined) | 0.7579 | 0.7842 | 0.8334 | 0.8757 | 0.8988 | 0.7341 | 0.8162 | 0.7688 | 0.7939 | 0.7211 | 0.8211 | 0.8081 | 0.8111 | 0.6499 | 0.8610
Hobs (South) | 0.7393 | 0.7643 | 0.8426 | 0.8857 | 0.9071 | 0.7214 | 0.8321 | 0.8250 | 0.7821 | 0.7429 | 0.8393 | 0.7785 | 0.7964 | 0.6464 | 0.8607
Hobs (West) | 0.7100 | 0.7550 | 0.8400 | 0.8450 | 0.8500 | 0.7450 | 0.8150 | 0.7300 | 0.7600 | 0.7450 | 0.8500 | 0.8250 | 0.7900 | 0.6550 | 0.8600
Hobs (Combined) | 0.7286 | 0.7592 | 0.8367 | 0.8673 | 0.8857 | 0.7265 | 0.8224 | 0.7816 | 0.7714 | 0.7408 | 0.8469 | 0.8020 | 0.7918 | 0.6510 | 0.8612
Homozygosity (South) | 0.2607 | 0.2357 | 0.1571 | 0.1143 | 0.0929 | 0.2786 | 0.1678 | 0.1750 | 0.2179 | 0.2571 | 0.1607 | 0.2214 | 0.2036 | 0.3536 | 0.1393
Homozygosity (West) | 0.2714 | 0.2408 | 0.1633 | 0.1327 | 0.1143 | 0.2735 | 0.1776 | 0.2184 | 0.2286 | 0.2592 | 0.1531 | 0.1979 | 0.2082 | 0.3489 | 0.1388
Homozygosity (Combined) | 0.2714 | 0.2408 | 0.1633 | 0.1327 | 0.1143 | 0.2735 | 0.1776 | 0.2184 | 0.2286 | 0.2592 | 0.1531 | 0.1979 | 0.2082 | 0.3489 | 0.1388
Paternity Index (South) | 1.9856 | 2.3872 | 4.4600 | 5.4500 | 6.3467 | 2.0406 | 3.1556 | 3.0233 | 2.6305 | 2.1195 | 3.8300 | 2.5856 | 2.6656 | 1.5106 | 4.2600
Paternity Index (West) | 1.8135 | 2.3262 | 3.4500 | 3.7333 | 4.0833 | 2.4027 | 2.9833 | 2.1159 | 2.4679 | 2.812302 | 4.2000 | 2.9500 | 2.6167 | 1.5802 | 4.1666
Paternity Index (Combined) | 1.8662 | 2.1523 | 3.2134 | 4.3439 | 4.5763 | 1.9380 | 2.9749 | 2.4625 | 2.3176 | 2.0800 | 3.4942 | 2.6018 | 2.4821 | 1.5058 | 3.7888
PE (South) | 0.4962 | 0.5419 | 0.6852 | 0.7686 | 0.8107 | 0.4792 | 0.6668 | 0.6481 | 0.5752 | 0.5064 | 0.6783 | 0.5663 | 0.5966 | 0.3649 | 0.7192
PE (West) | 0.4522 | 0.5258 | 0.6777 | 0.6889 | 0.6975 | 0.5194 | 0.6316 | 0.4977 | 0.5396 | 0.5202 | 0.7002 | 0.6476 | 0.5852 | 0.3769 | 0.7161
PE (Combined) | 0.4755 | 0.5283 | 0.6707 | 0.7341 | 0.7731 | 0.4859 | 0.6492 | 0.5693 | 0.5425 | 0.4839 | 0.6968 | 0.6055 | 0.5768 | 0.3675 | 0.7143
DC (South) | 0.8834 | 0.8910 | 0.9184 | 0.9403 | 0.9556 | 0.8339 | 0.9153 | 0.8852 | 0.9013 | 0.8502 | 0.9130 | 0.9038 | 0.9122 | 0.8094 | 0.9329
DC (West) | 0.8650 | 0.8925 | 0.9175 | 0.9270 | 0.9290 | 0.8620 | 0.8805 | 0.8420 | 0.8830 | 0.8160 | 0.9045 | 0.8915 | 0.8945 | 0.7995 | 0.9170
DC (Combined) | 0.8925 | 0.9099 | 0.9354 | 0.9552 | 0.9659 | 0.8726 | 0.9232 | 0.8951 | 0.9142 | 0.8620 | 0.9298 | 0.9161 | 0.9275 | 0.8157 | 0.9489
HWE (South) | 0.0737 | 0.8976 | 0.9599 | 0.6155 | 0.8946 | 0.7984 | 0.3816 | 0.7192 | 0.5088 | 0.7444 | 0.5944 | 0.6743 | 0.9233 | 0.0867 | 0.9432
HWE (West) | 0.0693 | 0.8454 | 0.9354 | 0.6019 | 0.8765 | 0.7681 | 0.3756 | 0.7090 | 0.4989 | 0.7311 | 0.5859 | 0.6646 | 0.9122 | 0.0767 | 0.9313
HWE (Combined) | 0.0811 | 0.8721 | 0.9476 | 0.8087 | 0.8857 | 0.7933 | 0.3788 | 0.7136 | 0.5486 | 0.7376 | 0.5902 | 0.6696 | 0.9178 | 0.0827 | 0.9373

MP: Matching Probability; PIC: Polymorphism Information Content; Hexp: Expected Heterozygosity; Hobs: Observed Heterozygosity; h: Homozygosity; PI: Paternity Index; PE: Power of Exclusion; DC: Discrimination Capacity; HWE: Hardy–Weinberg Equilibrium.

According to the above data, the lowest match probability was at Penta E (0.0444, 0.0710, and 0.0343 in the South, West, and combined Indian populations, respectively), while the highest was at TPOX (0.1906, 0.2005, and 0.1805, respectively). The highest values of discrimination capacity were found at the locus Penta E (0.9556, 0.9290, and 0.9659) and the lowest values at the locus TPOX (0.8094, 0.7995, and 0.8157) in the South, West, and combined Indian populations, respectively.
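Match probability and discrimination capacity are complementary: for the South Indian values in Table 5, DC equals 1 − MP to within rounding. Multiplying the per-locus match probabilities gives a combined match probability across loci; that product assumes the loci are independent and is shown here only as an illustration, not as a result reported by the study. The variable names are hypothetical.

```python
import math

loci = ["D3S1358", "TH01", "D21S11", "D18S51", "Penta_E", "D5S818", "D13S317",
        "D7S820", "D16S539", "CSF1PO", "Penta_D", "vWA", "D8S1179", "TPOX", "FGA"]

# MP (South) and DC (South) rows of Table 5, in the locus order above
mp_south = [0.1166, 0.1089, 0.0816, 0.0597, 0.0444, 0.1660, 0.0847, 0.1148,
            0.0987, 0.1497, 0.0869, 0.0961, 0.0878, 0.1906, 0.0670]
dc_south = [0.8834, 0.8910, 0.9184, 0.9403, 0.9556, 0.8339, 0.9153, 0.8852,
            0.9013, 0.8502, 0.9130, 0.9038, 0.9122, 0.8094, 0.9329]

# Largest disagreement between (1 - MP) and the tabulated DC at any locus
max_gap = max(abs((1.0 - mp) - dc) for mp, dc in zip(mp_south, dc_south))

# Combined match probability under an independence assumption (illustrative only)
combined_mp = math.prod(mp_south)

# Most and least discriminating loci by match probability
best_locus = loci[mp_south.index(min(mp_south))]
worst_locus = loci[mp_south.index(max(mp_south))]

print(f"largest |(1 - MP) - DC| = {max_gap:.4f}")
print(f"combined MP over 15 loci = {combined_mp:.2e}")
print(f"lowest MP: {best_locus}; highest MP: {worst_locus}")
```

The product shrinks with each added locus, which is why a multi-locus profile is far more discriminating than any single locus.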

The lowest polymorphic information content (PIC) values were 0.6039, 0.5935, and 0.6074 at locus TPOX in the South, West, and combined Indian populations, respectively, and the corresponding highest values were 0.8940 and 0.8916 at Penta E in the south and combined Indian populations, and 0.8516 at locus D18S51 in the west.

In the South, West, and combined Indian populations, the lowest expected heterozygosity (Hexp) values were 0.6503, 0.6475, and 0.6499, respectively, all at locus TPOX. The highest values were 0.9017 and 0.8988 at locus Penta E in the South and combined Indian populations and 0.8658 at locus D18S51 in the West Indian population. The minimum observed heterozygosity (Hobs) values were 0.6464, 0.6550, and 0.6510 in the South, West, and combined Indian populations, respectively, whereas the maximum values were 0.9071 and 0.8857 in the South and combined Indian populations and 0.8600 at the FGA locus in the West. The minimum homozygosity values of 0.0929 and 0.1143 were found at locus Penta E in the South and combined Indian populations, respectively, and 0.1400 at locus FGA in the West. The highest homozygosity values were 0.3536, 0.3450, and 0.3489 in the South, West, and combined Indian populations, respectively, all at locus TPOX. The minimum power of exclusion (PE) values of 0.3649, 0.3769, and 0.3675 were recorded at locus TPOX in the South, West, and combined Indian populations, respectively, whereas the maximum values were 0.8107 and 0.7731 at Penta E in the South and combined Indian populations and 0.7161 at the FGA locus in the West. The minimum paternity index (PI) values of 1.5106, 1.5802, and 1.5058 were observed at locus TPOX in the South, West, and combined Indian populations, respectively, while the maximum values were 6.3467 and 4.5763 at Penta E in the South and combined Indian populations and 4.2000 at the Penta D locus in the West. No divergence from Hardy–Weinberg equilibrium was detected in either population (P > 0.05).
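Hardy–Weinberg equilibrium can be checked in several ways, and the text does not state which test FORSTAT applied. As a simplified illustration only, the Python sketch below compares observed and expected heterozygote and homozygote counts at one locus with a chi-square test that is treated as having one degree of freedom; exact tests are generally preferred for multi-allelic STR loci. The function name and the toy genotypes are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def hwe_heterozygote_test(genotypes: Sequence[Tuple[str, str]]) -> Tuple[float, float]:
    """Chi-square test on pooled heterozygote vs homozygote counts.

    Returns (chi-square statistic, P value), with the statistic treated as
    having 1 degree of freedom. This is a simplification, not an exact test.
    """
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)

    # Expected homozygotes under HWE: N * sum(p_i^2); the rest are heterozygotes
    exp_hom = n * sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    exp_het = n - exp_hom

    obs_het = sum(1 for a, b in genotypes if a != b)
    obs_hom = n - obs_het

    chi2 = (obs_het - exp_het) ** 2 / exp_het + (obs_hom - exp_hom) ** 2 / exp_hom
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotypes for ten individuals at one locus
toy_genotypes = [
    ("8", "8"), ("8", "11"), ("8", "9"), ("9", "11"), ("11", "11"),
    ("8", "11"), ("8", "8"), ("9", "9"), ("8", "11"), ("8", "9"),
]
chi2, p_value = hwe_heterozygote_test(toy_genotypes)
print(f"chi-square = {chi2:.4f}, P = {p_value:.4f}")
```

A P value above 0.05, as in this toy case, gives no evidence of departure from equilibrium at that locus.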

Among the tables presented in this study, the allele frequency tables for the South and West Indian populations (Tables 1 and 2) are the most important, as they report the frequencies for each population separately and can be used for relationship analysis when live samples from those states are analyzed.


4. DISCUSSION

The accuracy of findings generally improves with a larger sample size. Prior research indicated that such probabilities can be calculated using just 100–150 tested individuals per population [14]. For this reason, a total of 410 individuals from the South and West Indian populations were evaluated to determine allele frequencies and statistical parameters of forensic relevance.

4.1. Alleles and Allele Frequencies

In the South, West, and combined Indian populations, the numbers of distinct alleles observed were 46, 42, and 54, respectively. An analysis of the genetic diversity of autosomal STRs in 11 populations of India found 43 and 38 alleles in the South and West Indian populations, respectively, which is very close to our results [15]. The total alleles recorded in the South, West, and combined Indian populations were 157, 159, and 182, respectively. Relative to the number of individuals, this corresponds to 75.1% (157/209) in the South and 79.1% (159/201) in the West Indian population. A population genetic analysis of 22 autosomal STRs in 357 individuals from 11 states across India recorded 275 alleles, or 77% (275/357) [16], which corroborates the values of this study. The maximum of 18 alleles in the South and 19 in the West was observed at the Penta E locus, and 23 alleles were found at FGA in the combined Indian population, indicating that Penta E and FGA are the most polymorphic loci. Similarly, research on genomic diversity in the Maharashtra population with 20 autosomal markers found that Penta E had the maximum number of alleles in the admixed and Teli populations [17]. The maximum allele frequencies of 0.4976, 0.4925, and 0.5040 were observed at TPOX in the South, West, and combined populations, respectively, whereas the minimum allele frequency was observed at several loci in each population. Similarly, a study of genetic variation at 15 autosomal microsatellite loci in the Tamil population observed a maximum frequency of 0.415 at TPOX, which is broadly similar to our result. Allele 8 had the highest frequency in all three populations [18]. Likewise, a study of genetic polymorphism at eleven STR loci in the Rajput population of Delhi reported that allele 8 of locus TPOX showed the highest frequency, 0.425 [19]. 
In line with [17], our study showed that TPOX was the least polymorphic locus in all three populations, with 8, 7, and 8 alleles noted in the South, West, and combined Indian populations, respectively.

4.2. Most and Least Common Alleles (MCA and LCA)

An examination of the MCA and LCA showed that the greatest number of alleles was recorded in 11 of the 15 loci investigated across all three populations. Allele 12 had the highest frequency in all three populations, distributed as follows: 26.28% in the South, 27.02% in the West, and 21.03% in the combined Indian population. Similar findings were reported by Shrivastava et al., who conducted a population genetic analysis of autosomal STR loci in the Sikh community of central India [20]. Their study likewise found allele 12 to be the most prevalent allele in all three ethnicities.

4.3. Range of Alleles

Penta E was the only locus at which the maximum range was seen across all three populations. It also had the greatest number of alleles in the South and West Indian populations (18 and 19, respectively), whereas in the combined Indian population locus FGA recorded the maximum of 23 alleles. Furthermore, at locus D7S820, all three populations shared a minimum range and number of alleles. Consistent with our findings, Badiye et al. investigated the genetic makeup of 227 unrelated persons from Maharashtra and found that the highest number of alleles, 19, was located at the locus Penta E [17].

4.4. Statistical Parameters

Penta E had the lowest match probability and the highest discrimination capacity across all three populations. In their study of 22 autosomal STRs in the Indian population, Singh and Nandineni showed that Penta E has the highest power of discrimination [16]. Penta E had the highest observed and expected heterozygosity in the South and combined Indian populations, whereas in the West Indian population the highest expected and observed heterozygosity were recorded at the D18S51 and FGA loci, respectively. Preet et al., who studied the genetic diversity of Gorkhas, reported that the most polymorphic and discriminatory locus in that population was FGA [21]. According to the results of the above analysis, the locus Penta E was the most variable of the studied loci, displaying the highest values of discrimination capacity (0.9659), expected heterozygosity (0.8988), observed heterozygosity (0.8857), polymorphic information content (0.8916), power of exclusion (0.7731), and paternity index (4.5763), and the lowest values of match probability and homozygosity. In contrast, locus TPOX recorded the highest match probability (0.1805) and homozygosity (0.3489), and the lowest polymorphic information content (0.6074), expected heterozygosity (0.6499), observed heterozygosity (0.6510), power of exclusion (0.3675), discrimination capacity (0.8157), and paternity index (1.5058), which indicates that TPOX is the least informative locus in these populations.
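To make the parameters named above concrete, the following Python sketch computes several of them from a vector of allele frequencies using commonly cited textbook definitions under Hardy–Weinberg assumptions. It is an illustration only: the function names and the example frequencies are hypothetical, the formulas are assumed rather than confirmed to be those applied by FORSTAT, and the sketch is not expected to reproduce the values reported in this study.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), without a small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


def match_probability(freqs):
    """Sum of squared genotype frequencies expected under Hardy-Weinberg."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2.0 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def power_of_discrimination(freqs):
    """PD = 1 - match probability."""
    return 1.0 - match_probability(freqs)


def power_of_exclusion(heterozygosity):
    """PE = h^2 * (1 - 2 * h * H^2), with h heterozygosity and H = 1 - h."""
    homozygosity = 1.0 - heterozygosity
    return heterozygosity ** 2 * (1.0 - 2.0 * heterozygosity * homozygosity ** 2)


def typical_paternity_index(heterozygosity):
    """TPI = 1 / (2 * H), with H the homozygosity (undefined when H = 0)."""
    homozygosity = 1.0 - heterozygosity
    if homozygosity <= 0.0:
        raise ValueError("homozygosity must be positive")
    return 1.0 / (2.0 * homozygosity)


# Hypothetical three-allele locus (NOT data from this study).
example_freqs = [0.5, 0.3, 0.2]
he = expected_heterozygosity(example_freqs)
pic = polymorphic_information_content(example_freqs)
mp = match_probability(example_freqs)
pd_value = power_of_discrimination(example_freqs)
```

In this sketch, a locus with many alleles at similar frequencies yields high heterozygosity and PIC and a low match probability, which is the pattern described for Penta E, while a locus dominated by one allele behaves like TPOX.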

This study may be considered for future STR-based analysis of Indian populations, since the allele frequencies and forensic parameter values are consistent with earlier STR research on the Indian population. If and when the Indian government passes “The DNA Technology (Use and Application) Regulation Bill – 2019,” these findings may serve as a foundation for further research and policy development. The key limitations of previous studies were the difficulty of collecting samples from different parts of the Indian states and the high cost of analyzing larger numbers of samples. The biggest challenge in this study was convincing participants to provide samples. In addition, the PP16 multiplex system has a low success rate with degraded DNA. Aside from offering a new comparison of the South and West Indian populations across 15 STRs, this study can be used by DNA analysts in South and West Indian laboratories for live forensic and relationship casework, as well as for further research elsewhere. The allele frequencies and forensic parameters derived from this work can be used for forensic identification and DNA relationship testing when analyzing live samples from the South and West Indian populations. This approach should help existing forensic casework in South and West India that relies on allele frequency databases for STR markers. This effort will be supplemented by the development of a panel of genetic markers specifically for the Indian population. The following example, using locus CSF1PO, shows how the findings of this study can be applied to a paternity case:

Mother: 11, 12

Child: 10, 11

Alleged Father: 9, 10

Matching allele with father: 10

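Calculation of RMNE (Random Man Not Excluded), where p is the frequency of the matching allele and q = 1 - p:

$$\mathrm{RMNE} = p^2 + 2p(1 - p) = p^2 + 2pq$$

Example (European population, p = 0.264):

$$\mathrm{RMNE} = (0.264)^2 + 2 \times 0.264 \times 0.736 = 0.0697 + 0.3886 = 0.4583$$

Frequency of allele CSF1PO*10 in several populations (the value of p used in the RMNE calculation)

Allele        European    African    Spanish    South Indian

CSF1PO*10     0.264       0.255      0.291      0.2177

Result from this study (e.g., South Indian, p = 0.2177):

$$\mathrm{RMNE} = (0.2177)^2 + 2 \times 0.2177 \times 0.7823 = 0.04739 + 0.3406 = 0.3880$$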

Similar calculations can be made for all 15 STR loci when relationship analysis is carried out on samples from the South and West Indian populations. Such a result can be defended in a court of law, provided that samples from the person in question (in this case, the alleged father) have been processed.
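The worked RMNE example can be reproduced with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the allele frequencies are the CSF1PO*10 values quoted in the example above.

```python
def rmne(p):
    """Random Man Not Excluded for one matching allele with frequency p.

    RMNE = p^2 + 2p(1 - p): the chance that a randomly chosen man carries
    at least one copy of the allele, assuming Hardy-Weinberg proportions.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p
    return p * p + 2.0 * p * q


# CSF1PO*10 frequencies as quoted in the worked example.
csf1po_allele_10 = {
    "European": 0.264,
    "African": 0.255,
    "Spanish": 0.291,
    "South Indian": 0.2177,
}

rmne_by_population = {
    population: round(rmne(p), 4) for population, p in csf1po_allele_10.items()
}
```

Because RMNE rises with the allele frequency, the lower South Indian frequency gives a smaller RMNE (0.3880) than the European value (0.4583), meaning fewer random men would be wrongly included when the population-specific frequency is used.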

It can be inferred that the findings of this study add to the present Indian DNA frequency dataset and provide insight into the variances, similarities, and genetic distances among the South and West Indian populations. The statistical parameters presented in this study correspond with the findings of previous studies conducted in different parts of India. The authors recommend that future researchers collect samples from the entire Indian population and analyze them with more STR loci, for example with the PowerPlex 21 (PP21) system, which was introduced after PP16. Future researchers can also focus on single nucleotide polymorphisms (SNPs), which have an advantage over STRs in that they can provide valuable information for individualization even from minute quantities of DNA. The use of software for the determination of various phenotypic features is recent and hence is an area with wide research scope.


5. CONCLUSION

The highest allele frequency was noted at TPOX in all three populations. In the South, West, and combined Indian populations, the observed alleles were 46, 42, and 54, respectively. The corresponding total alleles were 157, 159, and 182, respectively. The range of alleles was from 6 to 23 in all three populations. These three parameters showed that the South and West Indian populations had a similar range of observed alleles, frequencies, and number of alleles. The locus Penta E was the most variable of the studied loci, displaying the highest values of discrimination capacity (0.9659), expected heterozygosity (0.8988), observed heterozygosity (0.8857), polymorphic information content (0.8916), power of exclusion (0.7731), and paternity index (4.5763), and the lowest values of match probability and homozygosity. In contrast, locus TPOX recorded the highest match probability (0.1805) and homozygosity (0.3489), and the lowest polymorphic information content (0.6074), expected heterozygosity (0.6499), observed heterozygosity (0.6510), power of exclusion (0.3675), discrimination capacity (0.8157), and paternity index (1.5058), making it the least variable locus in these populations. The current data may serve as a suitable starting point for building the Indian population’s DNA database. The information gathered here may supplement existing databases from other STR-based studies of Indian people. These 15 STR loci are useful for personal identification due to their specificity and polymorphism. Thus, the authors advocate further genetic or forensic examination of polymorphism in the Indian population substructure using short tandem repeats (STRs).
People from Lakshadweep, Pondicherry, Kerala, Tamil Nadu, Telangana, Andhra Pradesh, Karnataka, Gujarat, Rajasthan, Maharashtra, and Goa should be considered a genetically representative sample of the South and West Indian population for the purposes of genetic analysis or the establishment of a DNA database for the entire Indian population.


6. AUTHORS’ CONTRIBUTIONS

All authors made substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; took part in drafting the article or revising it critically for important intellectual content; agreed to submit to the current journal; gave final approval of the version to be published; and agreed to be accountable for all aspects of the work. All the authors are eligible to be an author as per the International Committee of Medical Journal Editors (ICMJE) requirements/guidelines.


7. FUNDING

There is no funding to report.


8. CONFLICTS OF INTEREST

The authors report no financial or any other conflicts of interest in this work.


9. ETHICAL APPROVALS

The ethical committee of SRM Institute of Science and Technology in Chennai, India has approved this study (1887/IEC/2020).


10. DATA AVAILABILITY

Data can be obtained from the corresponding author upon valid request.


11. PUBLISHER’S NOTE

This journal remains neutral with regard to jurisdictional claims in published institutional affiliation.

REFERENCES

1.  Jain S, Panigrahi I, Sheth J, Agarwal S. STR markers for detecting heterogeneity in Indian population. Mol Biol Rep 2011;39:461-5. [CrossRef]

2.  Singh KS. India's Communities People of India. National Series. Vol. 4. India:Oxford University Press;1998.

3.  Available from:https://en.wikipedia.org/wiki/south_india [Last accessed on 2023 Jan 25].

4.  Available from:https://en.wikipedia.org/wiki/western_india [Last accessed on 2023 Jan 13].

5.  Martin PD. National DNA database-practice and practicability. A forum for discussion. Int Congr Ser 2004;1261:1-8. [CrossRef]

6.  Lee YS, Kennedy WD, Yin YW. Structural insight into processive human mitochondrial DNA synthesis and disease-related polymerase mutations. Cell 2009;139:312-24. [CrossRef]

7.  Butler JM. Genetics and genomics of core short tandem repeat loci used in human identity testing. J Forensic Sci 2006;51:253-65. [CrossRef]

8.  Hammond HA, Jin L, Zhong Y, Caskey CT, Chakraborthy R. Evaluation of 13 short tandem repeat loci for use in personal identification applications. Am J Hum Genet 1994;55:175-89.

9.  Reilly P. Legal and public policy issues in DNA forensics. Nat Rev Genet 2001;2:313-7. [CrossRef]

10.  Available from:https://thewire.in/government/dna-technology-regulation-bill-seen-to-harm-minorities-hurt-privacy [Last accessed on 2023 Feb 02].

11.  Panneerchelvam S, Norazmi MN. Forensic DNA profiling and database. Malays J Med Sci 2003;10:20-6.

12.  Shi Y, Li X, Ju D, Li Y, Zhang X, Zhang Y. Genetic polymorphisms of short tandem repeat loci D13S305, D13S631 and D13S634 in the Han population of Tianjin, China. Exp Ther Med 2015;10:773-7. [CrossRef]

13.  D'Amato E, Ristow PG. Forensic statistics analysis toolbox (FORSTAT):A streamlined workflow for forensic statistics. In:Forensic Science International:Genetics Supplement Series. Vol. 6. Netherlands:Elsevier;2017. p. e52-4. [CrossRef]

14.  Projic P, Skaro V, Samija I, Pojskic N, Durmic-Pasic A, Kovscevic L, et al. Allele frequencies for 15 short tandem repeat loci in representative sample of Croatian population. Croat Med J 2007;48:473-7.

15.  Ghosh T, Kalpana D, Mukerjee S, Mukherjee M, Sharma AK, Nath S, et al. Genetic diversity of autosomal STRs in eleven populations of India. Forensic Sci Int Genet 2011;3:259-61. [CrossRef]

16.  Singh M, Nandineni MR. Population genetic analyses and evaluation of 22 autosomal STRs in Indian populations. Int J Leg Med 2017;131:971-3. [CrossRef]

17.  Badiye A, Kpoor N, Kumawat RK, Dixit S, Mishra A, Dixit A, et al. A study of genomic diversity in populations of Maharashtra, India, inferred from 20 autosomal STR markers. BMC Res Notes 2021;14:69. [CrossRef]

18.  Balamurugan K, Kanthimathi S, Vijaya M, Suhasini G, Duncan G, Tracey M, et al. Genetic variation of 15 autosomal microsatellite loci in a Tamil population from Tamil Nadu, Southern India. Leg Med (Tokyo) 2010;12:320-3. [CrossRef]

19.  Chauhan T, Kushwaha KP, Chauhan V. Genetic polymorphism of eleven STR loci in Rajput population of Delhi, India. Forensic Res Criminol Int J 2015;1:192-6. [CrossRef]

20.  Shrivastava P, Jain T, Ben Trivedi V. Genetic polymorphism study at 15 autosomal locus in central Indian population. Springerplus 2015;4:566. [CrossRef]

21.  Preet K, Malhotra S, Shrivastava P, Jain T, Rawat S, Varte LR, et al. Genetic diversity in Gorkhas:An autosomal STR study. Sci Rep 2016;6:32494. [CrossRef]

Reference

1. Jain S, Panigrahi I, Sheth J, Agarwal S. STR markers for detecting heterogeneity in Indian population. Mol Biol Rep 2011;39:461-5. https://doi.org/10.1007/s11033-011-0759-5

2. Singh KS. India’s Communities People of India. National Series. Vol. 4. India: Oxford University Press; 1998.

3. Available from: https://en.wikipedia.org/wiki/south_india [Last accessed on 2023 Jan 25].

4. Available from: https://en.wikipedia.org/wiki/western_india [Last accessed on 2023 Jan 13].

5. Martin PD. National DNA database-practice and practicability. A forum for discussion. Int Congr Ser 2004;1261:1-8. https://doi.org/10.1016/S0531-5131(03)01844-2

6. Lee YS, Kennedy WD, Yin YW. Structural insight into processive human mitochondrial DNA synthesis and disease-related polymerase mutations. Cell 2009;139:312-24. https://doi.org/10.1016/j.cell.2009.07.050

7. Butler JM. Genetics and genomics of core short tandem repeat loci used in human identity testing. J Forensic Sci 2006;51:253-65. https://doi.org/10.1111/j.1556-4029.2006.00046.x

8. Hammond HA, Jin L, Zhong Y, Caskey CT, Chakraborty R. Evaluation of 13 short tandem repeat loci for use in personal identification applications. Am J Hum Genet 1994;55:175-89.

9. Reilly P. Legal and public policy issues in DNA forensics. Nat Rev Genet 2001;2:313-7. https://doi.org/10.1038/35066091

10. Available from: https://thewire.in/government/dna-technology-regulation-bill-seen-to-harm-minorities-hurt-privacy [Last accessed on 2023 Feb 02].

11. Panneerchelvam S, Norazmi MN. Forensic DNA profiling and database. Malays J Med Sci 2003;10:20-6.

12. Shi Y, Li X, Ju D, Li Y, Zhang X, Zhang Y. Genetic polymorphisms of short tandem repeat loci D13S305, D13S631 and D13S634 in the Han population of Tianjin, China. Exp Ther Med 2015;10:773-7. https://doi.org/10.3892/etm.2015.2560

13. D’Amato E, Ristow PG. Forensic statistics analysis toolbox (FORSTAT): A streamlined workflow for forensic statistics. In: Forensic Science International: Genetics Supplement Series. Vol. 6. Netherlands: Elsevier; 2017. p. e52-4. https://doi.org/10.1016/j.fsigss.2017.09.006

14. Projic P, Skaro V, Samija I, Pojskic N, Durmic-Pasic A, Kovscevic L, et al. Allele frequencies for 15 short tandem repeat loci in representative sample of Croatian population. Croat Med J 2007;48:473-7.

15. Ghosh T, Kalpana D, Mukerjee S, Mukherjee M, Sharma AK, Nath S, et al. Genetic diversity of autosomal STRs in eleven populations of India. Forensic Sci Int Genet 2011;3:259-61. https://doi.org/10.1016/j.fsigen.2010.01.005

16. Singh M, Nandineni MR. Population genetic analyses and evaluation of 22 autosomal STRs in Indian populations. Int J Leg Med 2017;131:971-3. https://doi.org/10.1007/s00414-016-1525-y

17. Badiye A, Kapoor N, Kumawat RK, Dixit S, Mishra A, Dixit A, et al. A study of genomic diversity in populations of Maharashtra, India, inferred from 20 autosomal STR markers. BMC Res Notes 2021;14:69. https://doi.org/10.1186/s13104-021-05485-z

18. Balamurugan K, Kanthimathi S, Vijaya M, Suhasini G, Duncan G, Tracey M, et al. Genetic variation of 15 autosomal microsatellite loci in a Tamil population from Tamil Nadu, Southern India. Leg Med (Tokyo) 2010;12:320-3. https://doi.org/10.1016/j.legalmed.2010.07.004

19. Chauhan T, Kushwaha KP, Chauhan V. Genetic polymorphism of eleven STR loci in Rajput population of Delhi, India. Forensic Res Criminol Int J 2015;1:192-6. https://doi.org/10.15406/frcij.2015.01.00031

20. Shrivastava P, Jain T, Ben Trivedi V. Genetic polymorphism study at 15 autosomal locus in central Indian population. Springerplus 2015;4:566. https://doi.org/10.1186/s40064-015-1364-1

21. Preet K, Malhotra S, Shrivastava P, Jain T, Rawat S, Varte LR, et al. Genetic diversity in Gorkhas: An autosomal STR study. Sci Rep 2016;6:32494. https://doi.org/10.1038/srep32494
