Functional analyses of the horizontally acquired Phytophaga glycoside hydrolase family 45 (GH45) proteins reveal distinct functional characteristics

Cellulose, a major polysaccharide of the plant cell wall, consists of β-1,4-linked glucose moieties forming a molecular network recalcitrant to enzymatic breakdown. Although cellulose is potentially a rich source of energy, the ability to degrade it is rare in animals and was believed to be present only in cellulolytic microbes. Recently, it has become clear that some animals encode endogenous cellulases belonging to several glycoside hydrolase families (GHs), including GH45. GH45s are distributed patchily among the Metazoa and, in insects, are encoded only by the genomes of Phytophaga beetles. This study aims to understand both the enzymatic properties and the evolutionary history of GH45s in these beetles. To this end, we tested the enzymatic abilities of 37 GH45s derived from five species of Phytophaga beetles and learned that beetle-derived GH45s degrade three different substrates: amorphous cellulose, xyloglucan and glucomannan. Our phylogenetic and gene structure analyses indicate that at least one gene encoding a putative cellulolytic GH45 was present in the last common ancestor of the Phytophaga, and that GH45 xyloglucanases evolved several times independently in these beetles. The most closely related clade to Phytophaga GH45s contained fungal sequences, suggesting this GH family was acquired by horizontal gene transfer from fungi. Other than in insects, arthropod GH45s do not share a common origin and appear to have emerged at least three times independently.

Introduction

[...] carboxymethyl cellulose (CMC) (Fig. 1B). Activity halos were visible for at least two GH45s per target species. The intensity of the observed activity halos varied from large clearing zones (for example, PCO4 or LDE2) to small or medium ones (such as PCO3 or LDE7). These differences were likely due to the catalytic efficiency of each individual GH45 as well as to the concentration of the crude protein extracts we used. Each clearing zone, independent of its intensity and size, indicated endo-β-1,4-glucanase activity.

To further assess enzymatic characteristics of these GH45s, we performed assays with a variety of plant cell wall-derived polysaccharides as substrates and analyzed the resulting breakdown products on thin layer chromatography (TLC) (Table 1; Fig. S1 to S5). We were able to confirm [...] GH45s (Davies, et al. 1995). In addition, we investigated three other sites crucial for enzymatic activity: (i) a proposed stabilizing aspartate (Asp114), (ii) a conserved alanine (Ala74) and (iii) a highly conserved tyrosine (Tyr8) (Fig. 2). Apart from two substitution events of Tyr8 in LDE9 and LDE3, this amino acid remained conserved in all other beetle GH45 sequences. LDE9 already possessed a mutation in its catalytic acid, which was likely responsible for the lack of activity. In LDE3, a substitution from Tyr8 to Phe8 did not significantly impact the catalytic abilities of this protein, likely because the side-chains of both amino acids are highly similar and differ only in a single hydroxyl group. When examining the proposed stabilizing site Asp114, we observed several amino acid substitutions that correlated with a loss of activity in PCO8, DVI3, DVI4, LDE1 and LDE8. Amino acid changes at the Asp114 position were also observed in PCO3 and CTR1, but were not correlated with a loss of enzymatic activity.
Since LDE4 and DVI7 appeared to have no mutation in Asp114, we screened the Ala74 residue for substitutions; the amino acid exchange we observed, from alanine to glycine in both cases, may have caused the loss of activity in these two proteins. Altogether, amino acid substitutions at important sites could be detected in some apparently inactive GH45s, but not in all of them. It may be that the proteins for which we did not find amino acid substitutions are still active enzymes, and we have just not yet found the right substrate; alternatively, we have not yet checked all the amino acid positions, some of which could also be crucial for catalysis.

Interestingly, all Chrysomelid GH45 xyloglucanases (except LDE5 and DVI6), including G. viridula GVI1 from our previous study (Busch, et al. 2018), displayed a substitution from aspartate to glutamate at the stabilizing site (Asp114) (Fig. 2). As glutamate differs from aspartate only by an additional methylene group within its side chain, we believe that this exchange [...] derived GH45 xyloglucanases, we found that GH45 xyloglucanases from the Curculionidae S. oryzae (SOR3 to SOR5) had a substitution from aspartate to glutamate in the proton donor residue (Asp121). We also believe that, in S. oryzae, this particular substitution may have contributed to the preference for xyloglucan over cellulose as a substrate.

In summary, we demonstrated that each species investigated encoded at least two cellulolytic GH45s that are able to degrade amorphous cellulose. We also demonstrated that at least one GH45 per species possessed the ability to degrade only xyloglucan. Interestingly, several GH45s did not show activity on any of the substrates we tested, suggesting that they have become pseudo-enzymes or are active on substrates not tested here.
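The residue comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and data layout are hypothetical, the reference residues are the four sites named in the text (Tyr8, Ala74, Asp114, Asp121), and the example entries contain only substitutions reported in the text, with unmentioned sites left out rather than guessed.

```python
# Reference residues at the four sites discussed in the text
# (numbering as used in the source; one-letter amino acid codes).
REFERENCE_SITES = {"Tyr8": "Y", "Ala74": "A", "Asp114": "D", "Asp121": "D"}


def find_substitutions(observed):
    """Return {site: (reference, observed)} for each site that differs.

    `observed` maps site names to one-letter residues; sites that are
    absent from `observed` are skipped, not treated as conserved.
    """
    changes = {}
    for site, reference in REFERENCE_SITES.items():
        residue = observed.get(site)
        if residue is not None and residue != reference:
            changes[site] = (reference, residue)
    return changes


# Only residues stated in the text are listed here (hypothetical layout).
examples = {
    "LDE3": {"Tyr8": "F"},                  # Tyr8 -> Phe8, still active
    "LDE4": {"Ala74": "G", "Asp114": "D"},  # Ala74 -> Gly, Asp114 unchanged
    "SOR3": {"Asp121": "E"},                # Asp121 -> Glu, xyloglucanase
}

for name, sites in examples.items():
    print(name, find_substitutions(sites))
```

A flagged substitution is only a candidate explanation for a change in activity; as the text notes for PCO3 and CTR1, a change at Asp114 does not always coincide with loss of activity.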
Phylogenetic analyses reveal multiple origins of GH45 genes during the evolution of Metazoa

To gain further insight into the evolutionary history of beetle-derived GH45 genes, we performed phylogenetic analyses. To this end, we collected amino acid sequences of GH45s available as of February 2018, including those from the CAZy database (http://www.cazy.org) (Lombard, et al. 2014) as well as from several transcriptome datasets accessible at NCBI GenBank. Interestingly, we found that the presence of GH45 genes in arthropods was not restricted to Phytophaga beetles: these genes were also present in transcriptomes/genomes of species of springtails (Hexapoda: Collembola) and of species of Oribatida mites (Arthropoda: Chelicerata) (Table S3). In addition, we identified GH45 [...] due to either the presence of GH45 genes in a common ancestor, followed by multiple gene losses, or from multiple independent acquisitions from foreign sources (i.e., HGT). To test these hypotheses, we collected a diverse set of GH45 sequences of microbial and metazoan origins, resulting in 264 sequences (Table S3). Subsequently, redundancy at the 90% identity level between sequences was eliminated, resulting in 201 non-redundant GH45 sequences. According to both Bayesian and maximum likelihood phylogenies (Fig. 3, S6 and S7), the arthropod-derived GH45 sequences were not monophyletic but globally fell into three separate groups. One highly [...] subgroups. In addition to the arthropods, the nematode GH45 sequences formed a highly supported monophyletic clade (PP = 1.0, bootstrap = 84) which was connected to a clade of fungal-derived sequences. This connection was highly supported (PP = 0.93, bootstrap = 69). The two other groups of Metazoa (tardigrades and rotifers) were located in a separate clade with species of Neocallimastigaceae fungi (Chytrids) (PP = 0.94, bootstrap = 14). Overall, this analysis showed that neither arthropod nor, more generally, metazoan GH45 sequences originated from a common ancestor, as they were scattered in multiple separate clades rather than forming a monophyletic metazoan clade.
In summary, our phylogenetic analyses illustrated that the evolutionary history of GH45s in the Metazoa was complex and pointed to the possibility that this gene family evolved several times independently in multicellular organisms. More specifically, our analyses suggested that this gene family had evolved at least three times independently in arthropods. Finally, our data pointed toward an acquisition of GH45 genes by the LCA of Phytophaga beetles, presumably through an HGT event, from a fungal donor.
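The redundancy-removal step mentioned in this section (264 sequences reduced to 201 at the 90% identity level) can be illustrated with a minimal sketch. The text here does not say which tool or algorithm was used, so the greedy scheme below is an assumption, as are the function names and the way identity is measured (over gap-free columns of rows taken from one alignment). The toy sequences are invented for the example and are not GH45 data.

```python
def identity(row_a, row_b):
    """Fraction of identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must come from the same alignment")
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def remove_redundancy(aligned, threshold=0.90):
    """Greedy filter (assumed scheme, not necessarily the authors' method).

    A sequence is kept only if its identity to every sequence kept so far
    is below `threshold`. Rows with fewer gaps are visited first so that
    the most complete sequence represents each group.
    """
    kept = []
    order = sorted(aligned, key=lambda name: aligned[name].count("-"))
    for name in order:
        if all(identity(aligned[name], aligned[k]) < threshold for k in kept):
            kept.append(name)
    return kept


# Invented toy alignment: seqB is 90% identical to seqA, seqC is unrelated.
toy = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYIAKQK",
    "seqC": "MRSGWLVNDE",
}
print(remove_redundancy(toy))
```

With this scheme the result depends on the visiting order, which is one reason dedicated clustering tools document their ordering rule explicitly.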

The structure of GH45 genes in Phytophaga beetles supports a single origin before the split of the Chrysomeloidea and Curculionoidea

The monophyly of the Phytophaga-derived GH45s in the above phylogenetic analyses suggests a common ancestral origin in this clade of beetles. We hypothesized that, if the presence of a GH45 in the Phytophaga beetles had resulted from a single acquisition in their LCA, the GH45 genes present in current species of leaf beetles, longhorned beetles and weevils would share a common exon/intron structure. To test this hypothesis, we mined the publicly available genomes of three species of Curculionidae, namely H. hampei (Vega, et al. 2015), D. ponderosae (Keeling, et al. 2013) and S. oryzae (unpublished), as well as the genomes of the Chrysomelidae L. decemlineata (Schoville, et al. 2018) and of the Cerambycidae A. glabripennis (McKenna, et al. 2016). We were able to retrieve the genomic sequence corresponding to each of the GH45 genes present in these beetle species, with the exception of DPO9, which we did not find at all, and SOR3 and SOR4, which we were able to retrieve only as partial genomic sequences. Our results showed that the number of introns varied between the different species (Fig. S8). In L. decemlineata (representing Chrysomelidae), we identified a single intron in each of the GH45 genes (except for LDE11, which had two). For A. glabripennis (representing Cerambycidae), we found two introns in each of the two GH45 genes. In H. hampei, D. ponderosae and S. oryzae (all representing Curculionidae), the number of introns ranged from three to five. Interestingly, all GH45 genes in these five species possessed an intron placed within the part of the sequence encoding the predicted signal peptide. Apart from DPO7 and DPO8, these introns were all in phase one.
This gene structure of Chrysomelid- and Curculionid-derived GH45 genes was consistent with our previous study investigating the gene structure of PCW-degrading enzymes, including GH45 genes, in the leaf beetle Chrysomela tremula (Pauchet, Saski, et al. 2014). The conservation of the phase and the position of this intron indicated that the LCA of the Phytophaga likely possessed a single GH45 gene having a phase one intron located in a part of the sequence encoding a putative signal peptide. To assess whether that particular intron is also present in the most closely related fungal species, we searched the genomes of Saccharomycetaceae and Neocallimastigaceae fungi (NCBI, whole-genome shotgun database) by BLAST using the protein sequence of GH45-1 of C. tremula. We did not detect any introns in fungal GH45 sequences (data not shown), suggesting that the proposed intron was acquired after the putative HGT event. The diversity of the overall intron-exon structure in phytophagous beetles likely resulted from subsequent and independent intron acquisition. In summary, and together with the monophyly of beetle-derived GH45s (Fig. 3), our analysis strongly supports a common ancestral origin of beetle GH45.
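For readers unfamiliar with the term, intron phase describes where an intron interrupts the reading frame: phase zero lies between two codons, phase one after the first base of a codon, and phase two after the second base. A minimal sketch of that arithmetic follows; the function names and the exon lengths in the example are hypothetical and are not taken from any GH45 gene.

```python
def intron_phase(coding_bases_before_intron):
    """Phase of an intron given the number of coding bases upstream of it.

    0: intron sits between two codons
    1: intron follows the first base of a codon
    2: intron follows the second base of a codon
    """
    if coding_bases_before_intron < 0:
        raise ValueError("number of coding bases cannot be negative")
    return coding_bases_before_intron % 3


def intron_phases(exon_coding_lengths):
    """Phases of all introns, given the coding length of each exon in order."""
    phases = []
    total = 0
    for length in exon_coding_lengths[:-1]:  # no intron follows the last exon
        total += length
        phases.append(intron_phase(total))
    return phases


# Hypothetical gene with three exons contributing 40, 200 and 150 coding bases.
print(intron_phases([40, 200, 150]))
```

Because phase depends only on the cumulative coding length upstream, a conserved phase at a conserved position, as reported here for the signal peptide intron, is what one would expect from a shared ancestral insertion.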

Evolution of the GH45 family after the initial split of Chrysomeloidea and Curculionoidea

We mined publicly available transcriptome and genome datasets of Phytophaga beetles (Table S4) and collected as many GH45 sequences as possible. We curated a total of 266 GH45 sequences belonging to 42 species of Phytophaga beetles. After amino acid alignment, we decided to exclude 60 partial GH45 sequences from our phylogenetic analysis because these were too short. We performed a "whole Phytophaga" phylogenetic analysis on the remaining 206 curated GH45 sequences using maximum likelihood (Fig. 4). Because most of the deeper nodes were poorly supported, we decided to collapse branches having a bootstrap support below 50.
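Collapsing branches below a support threshold, as described above, turns each weakly supported internal node into a polytomy by attaching its children directly to its parent. The sketch below shows the idea on a small hand-built tree; the `Node` class, the function names and the example topology are hypothetical and do not reproduce the software or the tree used in the study.

```python
class Node:
    """Minimal tree node: leaves have a name, internal nodes a support value."""

    def __init__(self, name=None, support=None, children=None):
        self.name = name
        self.support = support
        self.children = children or []

    def is_leaf(self):
        return not self.children


def collapse_low_support(node, threshold=50):
    """Replace internal branches with support below `threshold` by polytomies."""
    new_children = []
    for child in node.children:
        collapse_low_support(child, threshold)  # handle the subtree first
        weak = (
            not child.is_leaf()
            and child.support is not None
            and child.support < threshold
        )
        if weak:
            # Attach the grandchildren directly to the current node.
            new_children.extend(child.children)
        else:
            new_children.append(child)
    node.children = new_children
    return node


def to_newick(node):
    """Serialize the tree, writing support values as internal node labels."""
    if node.is_leaf():
        return node.name
    inner = ",".join(to_newick(child) for child in node.children)
    label = "" if node.support is None else str(node.support)
    return f"({inner}){label}"


# Hypothetical tree: the clade grouping C with (D,E) has a support of only 30.
tree = Node(children=[
    Node(support=95, children=[Node("A"), Node("B")]),
    Node(support=30, children=[
        Node("C"),
        Node(support=80, children=[Node("D"), Node("E")]),
    ]),
])
print(to_newick(collapse_low_support(tree)) + ";")
```

In the collapsed tree the well supported (D,E) pair survives, while its weakly supported grouping with C is dissolved.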

Our phylogenetic analysis indicated that no orthologous genes were found between species of Chrysomeloidea and Curculionoidea (Fig. 4). The only exception to this rule was found in a clade containing the xyloglucanases from the Chrysomelidae (clade m) and from the Curculionidae (clade n), which cluster together with a bootstrap support of 73. Proceeding cautiously, because the substrate switch from amorphous cellulose to xyloglucan seemed to be due to different amino acid substitutions at catalytically important sites between Chrysomelidae-derived xyloglucanases and Curculionidae-derived ones (Fig. 2), we hypothesize a single common ancestry of cellulolytic GH45s; in contrast, xyloglucanase activity likely arose through convergent evolution at least twice within the Phytophaga clade of beetles.

Clade n comprised Brentidae- and Curculionidae-derived GH45s including SOR3 to SOR5 (Fig. 4). According to our functional data, SOR3 to SOR5 act as xyloglucanases, suggesting that other GH45 proteins present in clade n may have also evolved to degrade xyloglucan. To test this hypothesis, we compared the catalytic residues of SOR3-5 to those of the other Curculionidae- and Brentidae-derived GH45 sequences from clade n (Fig. S9). We detected substitutions from an aspartate to a glutamate at Asp121 in all Curculionidae-derived GH45 sequences of this clade but not in the Brentidae-derived sequences, suggesting that Curculionidae-derived GH45s of this clade were likely to possess xyloglucanase activity. Functional analyses of the Brentidae-derived sequences present in clade n will be needed to determine whether these proteins are also xyloglucanases or whether they fulfill another function.

The second major cluster encompassed GH45s of clade l, with a bootstrap support of 95, and contained only Curculionidae-derived sequences (Fig. 4). Within this clade, we found SOR1 and SOR2, which are, according to our functional data, endo-active cellulases. Their presence in clade l suggests that other GH45s of this cluster may also possess endo-cellulolytic activity. To further support this hypothesis, we again investigated amino acid residues of the catalytic site by comparing SOR1 and SOR2 to other GH45 sequences in this clade (Fig. S9). We did not find crucial substitutions in any of the investigated sites, implying that all GH45 proteins of this clade may have retained endo-β-1,4-glucanase activity. GH45 sequences present in the other Curculionidae- and Brentidae-specific clades did not harbor any amino acid substitutions which could impair their catalytic properties, and they may all possess the ability to break down amorphous cellulose. More functional analyses will be necessary to assess the function of these proteins.

Regarding Chrysomeloidea-derived sequences, a highly supported clade ( [...] remaining Galerucinae-derived GH45s, such as those from the Alticines Phyllotreta armoraciae and Psylliodes chrysocephala, present in clade m have yet to be functionally characterized.

When the catalytic residues of active xyloglucanases from our study were compared to the uncharacterized GH45 sequences present in clade m, we observed that at least PAR6 and PCH5 of the Galerucinae and OCA10 and CPO2 of the Chrysomelinae had congruent substitutions (Asp114 > Glu114), which likely enabled those proteins to also degrade xyloglucan (Fig. S10).

Therefore, it is highly likely that the LCA of the Chrysomelinae and the Galerucinae possessed at least two GH45 proteins, an endo-acting cellulase and a xyloglucanase. [...]

In summary, our focus on Phytophaga-derived GH45s with regard to enzymatic characterization and ancestral origin allowed us to postulate that at least one GH45 protein was present in the LCA of the Phytophaga beetles and that this GH45 protein likely possessed cellulolytic activity.

After the split between Chrysomeloidea and Curculionoidea, the GH45 gene family evolved through gene duplications at the family, subfamily and even genus/species level. Finally, according to our data, the ability of these beetles to break down xyloglucan, one of the major components of the primary plant cell wall, evolved at least twice, once in the LCA of the Chrysomelinae and Galerucinae and once in the LCA of the Curculionidae.

In our previous research, we found that several beetles of the Phytophaga encoded a diverse set of GH45 putative cellulases (Pauchet, et al. 2010). Here we demonstrated that in each of the five Phytophaga beetles investigated, at least two of these GH45s possess cellulolytic activity. This discovery is in accordance with other previously described GH45 proteins from Insecta (Pauchet, [...] (Szydlowski, et al. 2015) and microbes (Mcgavin and Forsberg 1988).
Surprisingly, several GH45 proteins were able to degrade glucomannan in addition to cellulose. We hypothesize that GH45 bi-functionalization may have occurred as a result of the chemical similarities between cellulose and glucomannan. Glucomannan is a straight chain polymer [...] Callosobruchus maculatus (GH5 subfamily 10 or GH5_10) (Pauchet, et al. 2010; Busch, et al. 2017), and one GH5_8 has been characterized in the coffee berry borer H. hampei (Acuna, et al. 2012). But in contrast to the activity on glucomannan of some GH45s we observed here, those GH5_10s and GH5_8 were true mannanases, displaying activity towards galactomannan as well as glucomannan. Although our experiments suggested some GH45 cellulases were also active on glucomannan, we believe that the activity these proteins carry out could be important for the degradation of the PCW in the beetle gut. In fact, mannans, including glucomannan, can account for up to 5% of the plant primary cell wall (Scheller and Ulvskov 2010) and may be a crucial enzymatic target during PCW degradation. This hypothesis is further supported by the presence of at least one GH45 protein with some ability to degrade glucomannan in each of the Chrysomelid beetles for which we have functional data.

Another interesting discovery was that several GH45 proteins have lost their ability to use amorphous cellulose as a substrate and evolved instead to degrade xyloglucan, the major hemicellulose of the plant primary cell wall (Pauly, et al. 2013). We believe that the initial substrate shift from cellulose to xyloglucan has likely been promoted by similarities between the substrate backbones (in both cases β-1,4-linked glucose units). The major difference between cellulose and xyloglucan is that the backbone of the latter is decorated with xylose units (which in turn can be substituted by galactose and/or fucose). We presume that the substrate shift from a straight chain polysaccharide such as cellulose to a more complex one such as xyloglucan requires a similarly complex adaptation of the enzyme to its novel substrate. However, in contrast to glucomannan-degrading GH45s, GH45 xyloglucanases have apparently completely lost their ability to use amorphous cellulose as a substrate. Here, we clearly demonstrated that, following several rounds of duplications, GH45s in Chrysomelid beetles have evolved novel functions in addition to their ability to break down amorphous cellulose, allowing these insects to degrade two additional major components of the PCW, namely glucomannan and xyloglucan.

This broadening of their functions further emphasizes that GH45 proteins may have been an important innovation during the evolution of the Phytophaga beetles and may have strongly contributed to their radiation. In summary, the ability of GH45 proteins to degrade a variety of substrates either as monospecific or as bi-functionalized enzymes indicates that these proteins are particularly prone to substrate shifts.

According to our data, the ability to break down xyloglucan using a GH45 protein has evolved at least twice independently in Phytophaga beetles, once in the LCA of Chrysomelinae and Galerucinae and once in the LCA of the Curculionidae or of the Curculionidae and Brentidae.

Once the first Brentidae-derived GH45s are functionally characterized, we will know more.

Given that genome/transcriptome data for a majority of families and subfamilies are lacking throughout the Phytophaga clade, we expect that other examples of independent evolution of GH45 xyloglucanases will be revealed in the future. It is important to note that the ability to degrade xyloglucan, which represents an important evolutionary innovation for Phytophaga beetles, may not be linked solely to the evolution of the GH45 family. In fact, in A. glabripennis (Cerambycidae: Lamiinae), a glycoside hydrolase family 5 subfamily 2 (GH5_2) protein has evolved to degrade xyloglucan; additionally, orthologous sequences of this GH5_2 xyloglucanase have been found in other species of Lamiinae (McKenna, et al. 2016).

The ability of GH45s to break down xyloglucan correlated with a substitution event from an aspartate to a glutamate residue at a stabilizing site (Asp114) within the Chrysomelidae.

Interestingly, the same amino acid exchange was present in SOR3-SOR5 but was located at the catalytic acid (Asp121) rather than the stabilizing site (Asp114). Aspartate and glutamate share the same functional group but differ in the length of their side chain. Thus, the preservation of the functional group coupled with an elongated side chain has likely contributed to the substrate switch that allows those GH45 proteins to degrade xyloglucan.

Notably, DVI6 and LDE5 do not share that particular substitution but are able to degrade xyloglucan. Therefore, we believe that the transition from cellulase to xyloglucanase has not been driven solely by a single amino acid substitution, but has been triggered by changes at other [...] xyloglucan by using GH5_2, we believe that a substrate shift from cellulose to xyloglucan of a GH45 never evolved in this family of beetles, not that it was lost. Second, as described above, there are distinct substitution events in the catalytic site between proteins from species of the two superfamilies that likely led to their substrate shift. Based on these facts, we suggest that GH45 xyloglucanases have evolved convergently in both superfamilies. In contrast to GH45 xyloglucanases, GH45 cellulases were present in species of each Phytophaga family investigated to date. This strongly suggests that a cellulolytic GH45 was present in the ancestral Phytophaga [...] including beetles other than Phytophaga, and all publicly available insect genomes, as well as publicly available genomes of Collembola and Oribatida mites. Except for the latter two, we were unable to retrieve GH45 sequences from arthropods. Surprisingly, our phylogenetic analyses clearly showed that the arthropod-derived GH45s, rather than clustering together, formed three separate monophyletic groups. In fact, all metazoan-derived GH45s clustered separately, forming independent monophyletic groups. The patchy distribution of GH45 sequences among Metazoa indicates either that these proteins were acquired multiple times throughout animal evolution or that massive differential gene loss occurred within multicellular organisms.
The latter hypothesis appears to be less parsimonious as it implies the existence of multiple GH45s in the LCA of Opisthokonta (Fungi and Metazoa) followed by reciprocal differential gene losses and multiple independent total gene losses in many animal lineages. Intriguingly, the closest clade to the Phytophaga GH45 sequences contained fungal-derived sequences. Our phylogenetic analyses could not identify a specific donor species/group but both suggested species of Saccharomycetales or Neocallimastigaceae fungi as a potential source. The most parsimonious explanation for the appearance of GH45 genes in Phytophaga beetles is that one or more genes were acquired by horizontal gene transfer (HGT) from a fungal donor. A similar scenario may have been responsible for the presence of GH45 genes in Oribatida and Collembola, but this hypothesis remains speculative until more sequences from both these orders are identified. In addition to the monophyly of Phytophaga-derived GH45 sequences, a common origin was further suggested by the fact that the position and the phase of the first intron were (except for two cases) conserved across GH45 genes from the species of Cerambycidae, [...]

Strikingly, nematode-derived GH45 sequences were consistently grouped together with Saccharomycetales fungi in each analysis we ran, clearly demonstrating that the closest relatives to their GH45 genes were fungal and from different fungi than the insect relatives. The origin of nematode-derived GH45 genes has been investigated, and their acquisition by HGT from a fungal source has been proposed (Kikuchi, et al. 2004; Palomares-Rius, et al. 2014). Here we provide the third independent confirmation of this hypothesis.

In conclusion, our research indicated that the Phytophaga GH45s have adapted to substrate shifts. In addition to cellulose, this adaptation led to the recognition and catalysis of two [...]

Open reading frames (ORFs) were amplified from cDNAs using gene-specific primers based on previously described GH45 sequences of C. tremula, P. cochleariae, L. decemlineata, D. virgifera virgifera and S. oryzae (Pauchet, et al. 2010). If necessary, full-length transcript sequences were obtained by rapid amplification of cDNA ends PCR (RACE-PCR) using RACE-ready cDNAs as described by Pauchet, et al. (2010). For downstream heterologous expression, [...] reverse primer designed to omit the stop codon. cDNAs initially generated for the RACE-PCR, as described by Pauchet, et al. (2010), were used as PCR template, and the PCR reactions were performed using a high-fidelity Taq polymerase (AccuPrime, Invitrogen). The PCR products were cloned into the pIB/V5-His TOPO vector (Invitrogen) in frame with a V5-(His)6 epitope. TOP10 chemically competent Escherichia coli cells (Invitrogen) were transformed and incubated overnight on an LB agar plate containing ampicillin (100 µg/ml). To select constructs for which the recombinant DNA had ligated in the correct orientation, randomly picked colonies were checked by colony PCR using the OpIE2 forward primer (located on the vector) and a gene-specific reverse primer (Table S5). Positive clones were further cultured in 2x yeast extract tryptone (2xYT) medium containing 100 µg/ml ampicillin. After plasmid isolation using [...] after transfection, the culture medium was harvested and centrifuged (16,000 x g, 5 min, 4 °C) to remove cell debris. Successful expression was verified by Western blot using the anti-V5-HRP antibody (Invitrogen).
In order to collect enough material for downstream enzymatic activity assays, a single clone per construct was chosen to be transiently transfected in a 6-well plate format. 72 h after transfection, culture medium was harvested and treated as described above.

The cell medium was stored at 4 °C until further use.

Enzymatic characterization
The enzymatic activity of recombinant proteins was initially tested on agarose diffusion assays using carboxymethyl cellulose (CMC) as a substrate. Agarose (1%) plates were prepared, containing 0.1% CMC in 20 mM citrate/phosphate buffer pH 5.0. Small holes were made in the agarose matrix using cut-off pipette tips, to which 10 µl of the crude culture medium of each expressed enzyme was applied. After incubation for 16 h at 40 °C, activity was revealed by incubating the agarose plate in 0.1% Congo red for 1 h at room temperature followed by washing [...]

Sequences corresponding to Phytophaga GH45 proteins described in our previous studies were combined with those mined from several NCBI databases, such as the non-redundant protein database (ncbi_nr) and the transcriptome shotgun assembly database (ncbi_tsa) (Table S4). In addition, transcriptome datasets generated from species of Phytophaga beetles were retrieved from the short-read archive (ncbi_sra) (Table S4) and assembled using the CLC workbench program version 11.0. Reads were loaded and quality trimmed before being assembled using standard parameters. The resulting assemblies were screened for contigs matching known beetle GH45 sequences through BLAST searches. The resulting contigs were then manually curated and used for further analysis. Amino acid alignments were carried out using MUSCLE version [...] Goldman' (WAG) model, incorporating a discrete gamma distribution (shape parameter = 5) to model evolutionary rate differences among sites (+G) and a proportion of invariable sites (+I).

The robustness of the analysis was tested using 1,000 bootstrap replicates.

Large phylogenetic analysis

We used the GH45 protein sequence from Sitophilus oryzae (ADU33247.1) as a BLASTp query against the NCBI's non-redundant protein library with an E-value threshold of 1E-3. We retrieved the 250 best BLAST hits (Table S3), encompassing a majority of fungal sequences as well as various hexapod sequences (including Chrysomelidae, Curculionidae, Lamiinae and Collembola (=Entomobryomorpha)). Besides fungi and hexapods, GH45 sequences from 10 nematodes, from one Tardigrade, one Rotifer and one bacterium, as well as a few uncharacterized protists from environmental samples, were among the 250 best BLAST hits. We complemented this dataset with predicted proteins from several Oribatid mites (10 sequences) and Collembola (4 sequences) retrieved from ncbi_tsa (Table S3). This resulted in a collection of 264 sequences. [...]

[Figure legend, beginning missing] Crude culture medium of transfected cells was applied to an agarose-diffusion assay containing 0.1% CMC. Activity halos were revealed after 16 h incubation at 40 °C using Congo red.

Numbers above Western blot and agarose-diffusion assays correspond to the respective species of GH45s depicted in supplementary Table S1.