```{r}
#| label: setup
library(dplyr)
library(tidyr)
library(ggplot2)
library(stringdist)
library(ggdist)
library(patchwork)
library(here)
library(tidybayes)
# source("R/ambla.R")
source("R/utils.R")
# set ggplot theme and colour palette
my_theme <- theme_ambla() +
  theme(
    panel.grid = element_blank(),
    axis.line = element_line(colour = "black"),
    text = element_text(size = 12, colour = "black"),
    axis.text = element_text(colour = "black")
  )
theme_set(my_theme)
clrs <- c("#AA0E28", "#ED1D3F", "#F47B8F", "#70B3FF", "#004CA3", "#002147")
options(
  ggplot2.ordinal.fill = clrs[c(1, 4, 5)],
  ggplot2.ordinal.colour = clrs[c(1, 4, 5)],
  ggplot2.discrete.fill = clrs[c(1, 4, 5)],
  ggplot2.discrete.colour = clrs[c(1, 4, 5)],
  # the fill option needs a scale for the fill aesthetic
  ggplot2.continuous.fill = ggplot2::scale_fill_gradient,
  ggplot2.continuous.colour = ggplot2::scale_color_gradient
)
set.seed(1234)
```
## The initial lexicon
Average 20-year-old knows ~42,000 lemmas: **mental lexicon**
. . .
::: box
Lexical representations
: Phonological, conceptual, grammatical information of known words
:::
. . .
**First lexical representations at 6-9 months**
* Inter-modal experimental paradigms [@bergelson2012months; @tincoff1999beginnings]
* Parental reports and surveys [e.g., CDI, @fenson1994variability]
::: {.notes}
* The average English 20-year-old knows around 42,000 words
* The collection of words that an adult knows is known as the *mental lexicon*
* The *mental lexicon* is formed by *lexical representations*, each embedding phonological, conceptual, and grammatical information about known words
* In the last five years, we have explored the initial lexicon through the lens of a particular case of language learning: bilingual infants
* The foundations of a lexicon are laid by the first form-meaning mappings at 6 months
* Two sources of evidence: inter-modal experimental paradigms and parental reports
:::
## Normative trajectories of lexical development
Vocabulary size norms for 51,800 monolingual children learning 35 distinct languages [@frank2017wordbank]

::: notes
* Data from Wordbank, a database that collects CDI responses from thousands of children learning a variety of languages
* Here I show normative data for a large sample of CDI administrations
* Until 18 months, vocabulary size grows steadily from 0 to approx. 100 words
* From 18 months on, infants acquire new words at an increasingly faster rate, and know on average around 400 words
* Infants' second year of life is critical for lexical development
* This is the age range we will focus on in the present dissertation
:::
## Bilinguals face additional challenges, but do not lag behind
<br>
<br>
::: columns
:::: {.column width="33%"}
::::: {.fragment fragment-index=1}
:::::: box
Increased **complexity** in linguistic context
::::::
:::::
::::
:::: {.column width="33%"}
::::: {.fragment fragment-index=2}
:::::: box
Reduced linguistic **input** in each language
::::::
:::::
::::
:::: {.column width="33%"}
::::: {.fragment fragment-index=3}
:::::: box
Increased **referential ambiguity**
::::::
:::::
::::
:::
::: columns
:::: {.column width="33%"}
::::: {.fragment fragment-index=1}
Two overlapping codes
:::::
::::
:::: {.column width="33%"}
::::: {.fragment fragment-index=2}
Split into two languages
:::::
::::
:::: {.column width="33%"}
::::: {.fragment fragment-index=3}
\> 2 labels per referent
:::::
::::
:::
::: notes
- There are reasons to think that, a priori, bilingual infants should lag behind their monolingual peers in lexical development
- I highlight three reasons.
- First, bilinguals face a more complex linguistic environment, as they have to learn two partially overlapping codes instead of one (two grammar systems, two sets of words, two phoneme inventories, etc.)
- Second, they do so while receiving reduced linguistic input in each language, relative to monolinguals. Because their input is split into two languages, they are exposed to each language to a lesser degree than monolinguals.
- Third, bilinguals face increased referential ambiguity. They not only need to learn word-referent associations like monolinguals, they often have to associate more than one label with each referent, one in each language
:::
## Bilinguals face additional challenges, but do not lag behind
::: box
**Hoff et al. (2012)**: bilinguals acquire words at similar rates as monolinguals
:::
::: columns
:::: {.column}
::::: fragment
{width="85%"}
:::::
::::
:::: {.column}
::::: fragment
{width="90%"}
:::::
::::
:::
::: notes
- Despite these challenges, bilinguals reach most of their language acquisition milestones at the same ages as monolinguals
- This is especially true for lexical development
- For instance, I highlight this study by Hoff et al. (2012), in which they collected English and Spanish CDI administrations for a cohort of English monolinguals and English/Spanish bilingual children living in Florida.
- Both groups were matched across many demographic variables, like SES
- When the authors compared monolingual and bilingual vocabulary sizes in each language separately, they observed that English monolinguals knew more English words on average than bilinguals.
- But when English and Spanish vocabulary sizes were combined into a measure of total vocabulary size, monolinguals' and bilinguals' trajectories of vocabulary growth were equivalent.
- This raises the question: *How do bilinguals keep up with monolinguals despite facing a more challenging linguistic environment*?
:::
## Lexical similarity modulates vocabulary growth in bilinguals
::: columns
:::: {.column}
::::: box
**Floccia et al. (2018)**: CDI responses of 372 bilinguals learning **English + additional language**
:::::
::::
:::: {.column}
::::: fragment
**Lexical similarity**: Average phonological similarity (Levenshtein) between pairs of translations
:::::
::::
:::
<br>
. . .
**Higher lexical similarity, larger vocabulary size**
Stronger effect in the additional language (e.g., Dutch, Mandarin)
::: notes
- Recent studies have suggested that bilinguals might exploit the lexical similarity between their languages to boost their vocabulary growth.
- In an influential monograph published in 2018, Floccia et al. collected vocabulary data from a large sample of bilingual 2-year-olds in the UK.
- These children were learning English and an additional language, out of a diverse sample of 13 other languages from different typological families, linguistic groups, or that followed different grammatical systems.
- Most children were English-dominant, but were exposed to the additional language to varying degrees
- The authors calculated a measure of lexical similarity for each pair of languages.
- For each language pair, they calculated the average phonological similarity between the translation equivalents included in the English vocabulary checklists and in each of the other additional languages
- This measure was included as a predictor in a model that estimated participants' vocabulary size in each of their languages
- This index of lexical similarity turned out to be positively related with children's vocabulary size in their additional language
- For instance, English-Dutch bilinguals (learning two lexically similar languages) knew more Dutch words than English-Mandarin bilinguals (learning two low-similarity languages) knew in Mandarin
- Overall, these results pointed to lexical similarity as a possible facilitator of word acquisition
:::
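
The lexical similarity measure described in these notes can be sketched in a few lines of base R. This is a minimal illustration, not the code used by Floccia et al.: the helper `lev_similarity()`, the two-item word list, and the simplified transcriptions (no stress or syllable marks) are assumptions made for the example, and similarity is assumed to be one minus the Levenshtein distance divided by the length of the longer form.

```r
# Normalised Levenshtein similarity between paired strings (hypothetical helper):
# 1 = identical forms, 0 = no overlap at all
lev_similarity <- function(a, b) {
  # utils::adist() computes the Levenshtein (edit) distance between two strings
  d <- mapply(function(x, y) utils::adist(x, y)[1, 1], a, b, USE.NAMES = FALSE)
  1 - d / pmax(nchar(a), nchar(b))
}

# Toy list of translation equivalents (simplified transcriptions, assumed)
pairs <- data.frame(
  catalan = c("gat", "gos"),
  spanish = c("gato", "pero"),
  stringsAsFactors = FALSE
)

# Similarity of each translation pair
pairs$similarity <- lev_similarity(pairs$catalan, pairs$spanish)

# Lexical similarity of the language pair: average across translation pairs
lexical_similarity <- mean(pairs$similarity)
```

With these toy forms, *gat*-*gato* scores 0.75 and *gos*-*perro* scores 0, so the language-level average is 0.375. A real estimate would average over the full set of translation equivalents in the vocabulary checklists.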
## Lexical similarity modulates vocabulary acquisition in bilinguals

::: notes
- Such a facilitation effect would have important consequences for some populations of bilinguals.
- For instance, most infants in Catalunya learn Catalan and Spanish simultaneously.
- These two languages share high lexical similarity.
- In fact, when using the same method to calculate lexical similarity as in Floccia et al., we find that Catalan and Spanish share around twice as much similarity as English and Dutch, the pair of languages with the highest similarity included in the study
:::
## A cognate facilitation in lexical acquisition?
::: columns
:::: {.column}
**Cognates**: Phonologically-similar translation equivalents
| *Cognate* | *Non-cognate* |
|:-------------------:|:-------------------:|
| [cat] /ˈgat-ˈga.to/ | [dog] /ˈgos-ˈpe.ro/ |
::::
:::: {.column}
::::: fragment
Some evidence that cognates **acquired earlier** than non-cognates [@mitchell2023cognates; @bosch2014first; @tan2024role; @siow2022effect]
:::::
::::
:::
<br>
. . .
::: box
**What *mechanisms* support a cognate facilitation during word acquisition?**
:::
::: notes
- What makes two languages lexically similar is mostly the presence of *cognates*, or form-similar translation equivalents.
- *Cognates* are words that belong to two different languages, but share an equivalent meaning and are phonologically or orthographically similar
- A clear example of a cognate in Catalan and Spanish is *gat*-*gato* [cat]
- A clear example of a non-cognate is *gos*-*perro* [dog]
- There is some recent evidence that cognates are acquired earlier than non-cognates
- This would explain the facilitation effect of lexical similarity found by Floccia et al.
- But what are the mechanisms behind such a cognate facilitation effect?
:::
## Lexical access is language non-selective in bilinguals
{ width="150%" }
::: notes
- Current explanations of this effect stem from the language non-selective account of the bilingual lexicon
- This account states that during language comprehension and production, lexical representations from both languages are activated, even during monolingual situations
- For instance, when a Catalan-Spanish bilingual hears the Catalan word *cotxe*, not only is the lexical representation of *cotxe* activated, but also that of other phonologically related words in Catalan and Spanish
- There is extensive evidence in adults that this activation, which also percolates to semantic representations, influences the dynamics of word comprehension and production.
- There is some evidence of such parallel activation in infants.
- It has been suggested that this parallel activation of representations in both languages may facilitate the acquisition of cognates in infants: because of their higher phonological similarity, cognates are more strongly co-activated than non-cognates.
- But this effect still lacks a mechanistic account
:::
## The present dissertation
::: columns
::: {.column}
::: fragment
::: box
**Study 1**
:::
1. Provide a mechanistic account for the **cognateness facilitation**
2. **Test predictions** of the model
> Under review in *Child Development* (R2), {{< ai psyarxiv >}} {{< ai github >}}
:::
:::
::: {.column}
::: fragment
::: box
**Study 2**
:::
3. Test core assumption of the model: **Language non-selectivity** in the initial lexicon
> In preparation {{< ai github >}}
:::
:::
:::
::: notes
- This has been the aim of the present dissertation
- In Study 1, we present and test the predictions of a mechanistic account of the cognate facilitation effect in bilingual word acquisition
- In Study 2, we test one of the core assumptions of the model: that the initial bilingual lexicon operates in a language non-selective fashion
:::
# Study 1 {.inverse}
*Cognate beginnings to lexical acquisition: The AMBLA model*
## **A**ccumulator **M**odel of **B**ilingual **L**exical **A**cquisition (**AMBLA**)
::: box
1. **Accumulation** of information about **form-meaning mappings**:
<span style="color:#a80035;">**Learning instances**</span>: Exposure to a word-form that results in the accumulation of information about its meaning
:::: fragment
2. <span style="color:#00a857;">**Age of acquisition**</span>: The infant accumulates a <span style="color:#0040a8;">**threshold**</span> amount of learning instances for a word-form
::::
:::
. . .
$$
\begin{aligned}
\definecolor{myred}{RGB}{ 168, 0, 53 }
\definecolor{myblue}{RGB}{ 0, 64, 168 }
\definecolor{mygreen}{RGB}{0, 168, 87}
\definecolor{grey}{RGB}{128, 128, 128}
\textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\
{\color{mygreen}\text{Age of Acquisition}_{ij}} &= \{\text{Age}_i \mid {\color{myred}\text{Learning instances}_{ij}} = {\color{myblue}\text{Threshold}} \}\\
{\color{myred}\text{Learning instances}_{ij}} &= \text{Age}_i \cdot \text{Freq}_j \\
\end{aligned}
$$
::: notes
- I present here the *Accumulator Model of Bilingual Lexical Acquisition* or AMBLA
- This is an *accumulator* model, in that it assumes that infants learn words by gradually accumulating information about word-referent mappings.
- Information is collected by the child in the form of *learning instances*
- We define learning instances as exposure to tokens of the word that result in the successful accumulation of information about their meaning
- We assume that an infant acquires a word when they have accumulated some threshold amount of learning instances for it
- The age at which this occurs is the age of acquisition of the word
- The rate at which an infant accumulates learning instances is a function of two variables: their age (older infants will have accumulated more learning instances than younger infants), and the lexical frequency of the words (infants will accumulate learning instances faster for more frequent words than for less frequent words)
:::
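
The two equations on this slide can be written as a short base R sketch. The function names are hypothetical, age is assumed to advance in whole months, and the threshold is treated as reached when the learning instances are greater than or equal to it (the slide writes an equality). The ages it returns are not meant to reproduce the simulated trajectories shown elsewhere in the deck, which may rely on details not given here.

```r
# Learning instances accumulated by a given age (in months) for a word
# encountered `freq` times per month: Age * Freq
learning_instances <- function(age, freq) {
  age * freq
}

# Age of acquisition: first age at which the accumulated learning
# instances reach the threshold (NA if never reached within `ages`)
age_of_acquisition <- function(freq, threshold, ages = 1:60) {
  reached <- learning_instances(ages, freq) >= threshold
  if (any(reached)) ages[which(reached)[1]] else NA_integer_
}

# A more frequent word reaches the same threshold earlier
aoa_frequent <- age_of_acquisition(freq = 50, threshold = 300)  # 6
aoa_rare <- age_of_acquisition(freq = 10, threshold = 300)      # 30
```

Under this deterministic reading, age of acquisition is simply the threshold divided by the monthly frequency, rounded up to the next whole month.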
## AMBLA: Simulating *monolingual* word acquisition
::: columns
:::: {.column width="40%"}
Catalan monolingual child
- <span style="color:#FFFFFF; background:#004AAD;">/'gos/ (Catalan), 100%</span>
**Parameters**:
$$
\begin{aligned}
\text{Threshold} = 300 \\
\text{Freq}_j \sim \text{Poisson}(\lambda = 50)
\end{aligned}
$$
::::
:::: {.column width="60%"}
::::: fragment
{width="100%"}
:::::
::::
:::
::: notes
- I will illustrate how AMBLA works with some simulations
- I will now simulate how a Catalan monolingual child (exposed 100% of the time to Catalan) acquires the Catalan word *gos*
- For illustration purposes, I have fixed some parameters: I assume that the threshold of learning instances for word acquisition is 300
- I also assume that infants encounter the word *gos* around 50 times per month, as drawn from a Poisson probabilistic distribution in each month
- In this particular simulation, the infant has acquired the word *gos* at 26 months
:::
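
A stochastic version of this simulation might look like the sketch below. It assumes that a fresh Poisson draw is taken each month and that the draws are summed until the threshold is crossed; `simulate_aoa()` and `max_age` are hypothetical names, and the code is not the one behind the figure, so its ages need not match those reported in the notes.

```r
set.seed(1234)

# Simulate one acquisition trajectory: monthly learning instances are
# drawn from a Poisson distribution and accumulated over time
simulate_aoa <- function(lambda = 50, threshold = 300, max_age = 60) {
  monthly <- rpois(max_age, lambda)   # learning instances gained each month
  cumulative <- cumsum(monthly)       # accumulated learning instances
  reached <- which(cumulative >= threshold)
  if (length(reached) > 0) reached[1] else NA_integer_
}

# One simulated child
single_run <- simulate_aoa()

# Repeating the simulation 50 times gives a distribution of ages of acquisition
many_runs <- replicate(50, simulate_aoa())
mean_aoa <- mean(many_runs)
```

Averaging over repeated runs, as in `mean_aoa`, smooths out the month-to-month noise of a single trajectory.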
## AMBLA: Simulating *monolingual* word acquisition
::: columns
:::: {.column width="40%"}
Catalan monolingual child
- <span style="color:#FFFFFF; background:#004AAD;">/'gos/ (Catalan), 100%</span>
**Parameters**:
$$
\begin{aligned}
\text{Threshold} = 300 \\
\text{Freq}_j \sim \text{Poisson}(\lambda = 50)
\end{aligned}
$$
::::
:::: {.column width="60%"}
{width="100%"}
::::
:::
::: notes
- If we simulate this acquisition trajectory 50 times, we find that, on average, a Catalan monolingual will acquire the word at 24 months of age
:::
## AMBLA: Simulating *bilingual* word acquisition
::: box
3. **Linguistic input divided** into two languages: **Catalan 60%, Spanish 40%**
<span style="color:#a80035;">**Exposure**</span>: Proportion of time exposed to the language of $j$ word
:::
. . .
Accumulation of learning instances, a function of <span style="color:#a80035;">**Exposure**</span> and *Frequency*.
. . .
$$
\begin{aligned}
\textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\
\text{Age of Acquisition}_{ij} &= \{\text{Age}_i \mid \text{Learning instances}_{ij} = \text{Threshold} \}\\
\text{Learning instances}_{ij} &= \text{Age}_i \cdot \text{Freq}_j \cdot {\color{myred}\text{Exposure}_{ij}}\\
\end{aligned}
$$
::: notes
- To extend this model to the bilingual case, we need to account for the fact that bilinguals' linguistic input is split into two languages
- Now the rate of accumulation of learning instances is also a function of the child's exposure to the language that a word belongs to
:::
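
The exposure term can be added to a simulation by scaling the monthly rate of learning instances. The sketch below is one possible reading, assuming monthly Poisson draws with rate `lambda * exposure`; the function name, `max_age`, and the number of repetitions are illustrative choices rather than the settings used for the figures.

```r
set.seed(1234)

# Accumulator in which the monthly rate of learning instances is
# scaled by the proportion of exposure to the word's language
simulate_aoa <- function(exposure, lambda = 50, threshold = 300, max_age = 120) {
  monthly <- rpois(max_age, lambda * exposure)
  reached <- which(cumsum(monthly) >= threshold)
  if (length(reached) > 0) reached[1] else NA_integer_
}

# Monolingual (100%) vs. the two languages of a 60/40 bilingual
exposures <- c(monolingual = 1, catalan = 0.6, spanish = 0.4)

# Mean age of acquisition over 50 simulated trajectories per condition
mean_aoa <- sapply(exposures, function(e) mean(replicate(50, simulate_aoa(e))))
```

In this sketch, lower exposure slows the accumulation of learning instances, so the word in the 40% language is acquired last on average.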
## AMBLA: Simulating *bilingual* word acquisition
::: columns
:::: {.column width="40%"}
Catalan monolingual child
- <span style="background:#004AAD;color:white;">/'gos/ (Catalan), 100%</span>
Catalan/Spanish bilingual child
- <span style="background:#C8102E;color:white;">/'gos/ (Catalan), 60%</span>
- <span style="background:#FF9E1F;color:black;">/'pe.ro/ (Spanish), 40%</span>
**Parameters**:
$$
\begin{aligned}
\text{Threshold} = 300 \\
\text{Freq}_j \sim \text{Poisson}(\lambda = 50)
\end{aligned}
$$
::::
:::: {.column width="60%"}
::::: fragment
{width="93%"}
:::::
::::
:::
::: notes
- As I did before I will now simulate how a Catalan/Spanish bilingual child might acquire the Catalan and Spanish words *gos* and *perro*
- For illustration purposes, I will assume that this bilingual is Catalan-dominant, as they are exposed to Catalan 60% of the time, and to Spanish 40% of the time
- Everything else remains the same
- I will also show the trajectory of acquisition for the monolingual child for reference
- As we can see, the bilingual child acquires the Catalan word first, and the Spanish word later
- Overall, the bilingual child acquires both words later than the Catalan monolingual acquires the Catalan word
- This is the result of the bilingual child's dual language exposure
:::
## AMBLA: Simulating *bilingual* word acquisition
::: columns
:::: {.column width="40%"}
Catalan monolingual child
- <span style="background:#004AAD;color:white;">/'gos/ (Catalan), 100%</span>
Catalan/Spanish bilingual child
- <span style="background:#C8102E;color:white;">/'gos/ (Catalan), 60%</span>
- <span style="background:#FF9E1F;color:black;">/'pe.ro/ (Spanish), 40%</span>
**Parameters**:
$$
\begin{aligned}
\text{Threshold} = 300 \\
\text{Freq}_j \sim \text{Poisson}(\lambda = 50)
\end{aligned}
$$
::::
:::: {.column width="60%"}
{width="100%"}
::::
:::
::: notes
- If we carry out this simulation 50 times, we can see that on average, the bilingual child acquires both words at later ages than the monolingual child
- This difference is especially large for the acquisition of the word in the lower-exposure language
- But this model does not include any kind of cognate facilitation effect
- I will now illustrate how we have implemented this effect, and I will show some simulations of the acquisition of a cognate word
:::
## AMBLA: Simulating a *cognate facilitation*
::: box
4. Words may accumulate additional learning instances from the **co-activation** of their (phonologically similar) **translation equivalent**
Degree proportional to their phonological similarity (<span style="color:#a80035;">**Cognateness**</span>)
:::
. . .
$$
\begin{aligned}
\textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\
\text{Age of Acquisition}_{ij} &= \{\text{Age}_i \mid \text{Learning instances}_{ij} = \text{Threshold} \}\\
\text{Learning instances}_{ij} &= \text{Age}_i \cdot \text{Freq}_j \cdot \text{Exposure}_{ij} + \\
&({\color{myred}\text{Learning instances}_{ij'} \cdot \text{Cognateness}_{j,j'}})\\
\textbf{where:} \\
{\color{myred}\text{Cognateness}_{j,j'}}&{\color{myred} = \text{Levenshtein}(j, j')}
\end{aligned}
$$
::: notes
- Following the language non-selective account of the bilingual lexicon, we assume that infants co-activate translation equivalents in both languages, even in monolingual situations
- The strength of this co-activation is a function of the phonological similarity between both word-forms
- During language exposure cognate words will be more strongly co-activated than non-cognate words
- As a result of this co-activation, when an infant encounters a given word-form in their speech input, they may not only accumulate a learning instance for that word, but also for its translation equivalent
- The degree to which the translation accumulates this additional learning instance is a function of the phonological similarity between both word-forms
- This is expressed in the model by adding a new term to the learning instance accumulator
- This term adds the number of learning instances that the translation equivalent of the word has acquired, weighted by the phonological similarity between them
- This phonological similarity is measured using the Levenshtein distance between both phonological transcriptions
- As a result, identical cognates will accumulate cross-linguistic learning instances with every exposure to either translation, while words with no phonological similarity will not benefit from any additional learning instances
:::
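
The cross-language term can be sketched as follows. This is one possible reading of the equation, stated as an assumption: each month, a word gains its own Poisson-distributed learning instances plus the instances its translation gained from direct exposure, weighted by cognateness. The function names and `max_age` are hypothetical, and cognateness is passed in as a number between 0 and 1 rather than computed from transcriptions.

```r
# First month at which accumulated learning instances reach the threshold
first_reached <- function(instances, threshold) {
  reached <- which(cumsum(instances) >= threshold)
  if (length(reached) > 0) reached[1] else NA_integer_
}

# Simulate a translation pair (j, j') for one bilingual child
simulate_pair <- function(cognateness, exposure = c(0.6, 0.4),
                          lambda = 50, threshold = 300, max_age = 120) {
  own_j  <- rpois(max_age, lambda * exposure[1])  # direct exposure to j
  own_j2 <- rpois(max_age, lambda * exposure[2])  # direct exposure to j'
  c(
    # each word also gains a share of its translation's instances
    j       = first_reached(own_j  + cognateness * own_j2, threshold),
    j_prime = first_reached(own_j2 + cognateness * own_j,  threshold)
  )
}

# Same random draws for both conditions, so only cognateness differs
set.seed(1234)
non_cognate <- simulate_pair(cognateness = 0)
set.seed(1234)
cognate <- simulate_pair(cognateness = 0.75)
```

Because both calls reuse the same seed, any difference between `cognate` and `non_cognate` comes from the cross-language term alone. With a cognateness of 0 the extra term vanishes and the simulation reduces to the exposure-only accumulator.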
## AMBLA: Simulating a *cognate facilitation*
::: columns
:::: {.column width="40%"}
Catalan monolingual child
- <span style="background:#004AAD;color:white;">/'gat/ (Catalan), 100%</span>
Catalan/Spanish bilingual child
- <span style="background:#C8102E;color:white;">/'gat/ (Catalan), 60%</span>
- <span style="background:#FF9E1F;color:black;">/'ga.to/ (Spanish), 40%</span>
**Parameters**:
$$
\begin{aligned}
\text{Threshold} = 300 \\
\text{Freq}_j \sim \text{Poisson}(\lambda = 50) \\
\text{Cognateness}_{j,j'} = 0.75
\end{aligned}
$$
::::
:::: {.column width="60%"}
::::: fragment

:::::
::::
:::
::: notes
- Let's visualise this with more simulations
- I've simulated the acquisition of the cognate words *gat* and *gato* in Catalan and Spanish by the same monolingual and bilingual children
- These words share 75% phonological similarity, so we would expect both words to accumulate learning instances faster now
- As we can see, the bilingual now acquires the Catalan and Spanish words at ages closer to the age at which the Catalan monolingual acquires the Catalan word
- This change is especially large in Spanish, the lower-exposure language
:::
## AMBLA: Simulating a *cognate facilitation*
::: columns
:::: {.column width="40%"}
Catalan monolingual child:
- <span style="background:#004AAD;color:white;">/'gat/ (Catalan), 100%</span>
Catalan/Spanish bilingual child:
- <span style="background:#C8102E;color:white;">/'gat/ (Catalan), 60%</span>
- <span style="background:#FF9E1F;color:black;">/'ga.to/ (Spanish), 40%</span>
**Parameters**:
$$
\begin{aligned}
\text{Threshold} = 300 \\
\text{Freq}_j \sim \text{Poisson}(\lambda = 50) \\
\text{Cognateness}_{j,j'} = 0.75
\end{aligned}
$$
::::
:::: {.column width="60%"}

::::
:::
::: notes
- If we carry out this simulation 50 times, we confirm that a cognateness effect, as implemented in AMBLA, predicts a closer age of acquisition for both words
- The word in the lower-exposure language (Spanish in this case), benefits more strongly from its cognate status, as it receives additional learning instances from its more frequent translation more often
:::
---
### Predictions
::: box
1. Cognates acquired **earlier** than non-cognates
2. Cognateness facilitation stronger in the **lower-exposure language**
:::
::: notes
- In summary, the AMBLA model generates two predictions
- First, that bilinguals acquire cognates earlier than non-cognates
- Second, that this facilitation effect is larger in the lower-exposure language, compared to the higher-exposure language
:::
---
### Predictions
::: box
1. Cognates acquired **earlier** than non-cognates
2. Cognateness facilitation stronger in the **lower-exposure language**
:::
### Barcelona Vocabulary Questionnaire (BVQ)
::: columns
:::: {.column width="50%"}
::::: fragment
{ width="100%" }
:::::
::::
:::: {.column width="50%"}
::::: fragment
<br>
- Online, open source {{< fa brands r-project >}} {{< fa brands github >}}
- $\approx$ 1,600 words (800 Cat., 800 Spa.)
- 4 sublists, random allocation
:::::
::::
:::
::: notes
- To test these predictions, we designed an online vocabulary questionnaire (the BVQ)
- This questionnaire, which is inspired by the CDI, includes a language exposure survey, a demographic survey and two vocabulary checklists
- One vocabulary checklist includes words in Catalan, and the other includes their translations into Spanish
- In each checklist, caregivers are presented with a list of words
- Caregivers are asked to mark, for each word, whether their child can *understand* or *understand and say* it
- The questionnaire is open-source, and is publicly available on GitHub
:::
## Results: Comprehension
366 participants (12-32 mo), 436 administrations $\times$ 604 noun words
Ordinal, multilevel (Bayesian) regression model
$p(\text{Comprehension}, \text{Production}) \sim \text{Exposure}_{ij} \cdot \text{Cognateness}_j$
. . .

::: notes
- We gathered 436 questionnaire administrations from 366 12-32 month-old infants learning Catalan and/or Spanish in the Metropolitan Area of Barcelona
- We modelled the probability that each child would understand or produce each of the words
- We used an ordinal Bayesian multilevel model that included several predictors of interest
- The critical predictors were the main effect of $Cognateness$, and its interaction with $Exposure$
- First I am going to present the results for *comprehension*
- In this figure I am showing the posterior predicted probability of comprehension across ages for three average word-forms
- I am showing predictions for an identical cognate with 100% phonological similarity with its translation
- I am also showing the predictions for word-forms with 50% and 0% similarity with their translations, the latter being a clear non-cognate
- Predictions are generated for word acquisition in the lower-exposure language (to the left) and for the higher-exposure language (to the right)
- In the lower-exposure language, we observed a substantial facilitation effect of cognateness: the probability of acquisition grows faster for cognates
- In the higher-exposure language, this effect was not observed
:::
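
To make the "ordinal" part of the model concrete, the sketch below shows how a cumulative-logit model turns a linear predictor into probabilities for three ordered responses. Everything specific here is an assumption for illustration: the logit link, the cutpoints, the coefficient values, and the function names are arbitrary and are not estimates from the study, and age and the multilevel structure are left out.

```r
# Category probabilities under a cumulative-logit (ordinal) model with
# three ordered responses: No < Understands < Understands and Says
ordinal_probs <- function(eta, cutpoints = c(0, 2)) {
  # P(response is above each cutpoint)
  p_above <- plogis(eta - cutpoints)
  c(
    no = 1 - p_above[1],
    understands = p_above[1] - p_above[2],
    understands_and_says = p_above[2]
  )
}

# Hypothetical linear predictor: exposure, cognateness and their interaction
linear_predictor <- function(exposure, cognateness,
                             b_exposure = 2, b_cognateness = 1,
                             b_interaction = -1) {
  b_exposure * exposure + b_cognateness * cognateness +
    b_interaction * exposure * cognateness
}

# Lower-exposure language: identical cognate vs. non-cognate
p_cognate <- ordinal_probs(linear_predictor(exposure = 0.2, cognateness = 1))
p_noncognate <- ordinal_probs(linear_predictor(exposure = 0.2, cognateness = 0))

# Probability of comprehension = probability of any response above "no"
comprehension_cognate <- 1 - unname(p_cognate["no"])
comprehension_noncognate <- 1 - unname(p_noncognate["no"])
```

The negative interaction coefficient is chosen only so that the illustrative cognateness effect shrinks as exposure grows, mirroring the pattern described in the notes.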
## Results: Production
366 participants (12-32 mo), 436 administrations $\times$ 604 noun words
Ordinal, multilevel (Bayesian) regression model
$p(\text{Comprehension}, \text{Production}) \sim \text{Exposure}_{ij} \cdot \text{Cognateness}_j$

::: notes
- We found similar results in Production
- Participants were able to produce the words at later ages than in comprehension, as expected
- We again observed a substantial facilitation effect of cognateness, but only in the lower-exposure language
:::
## Discussion
**Earlier acquisition** for **cognates** vs. non-cognates
. . .
Cognate facilitation **moderated by exposure**
> Only words from the lower-exposure language benefit from cognateness
. . .
Cognateness as a candidate mechanism underlying Floccia et al.'s results
. . .
Cross-language facilitation via co-activation of phonologically similar translation equivalents
. . .
::: box
**Is language non-selectivity already present in the initial lexicon?**
:::
::: notes
- In summary, we found an earlier acquisition of cognates than non-cognates, in line with the predictions of AMBLA
- This effect was modulated by quantitative language-exposure
- Only the acquisition of words from the lower-exposure language benefitted from their cognate status
- This cognate facilitation provides a candidate mechanism for the results in Floccia et al.
- We presented a mechanistic account for this effect in the form of a model, the AMBLA model
- In this model, we propose that the cognate facilitation effect is the result of a cross-language accumulation of learning instances, in line with the language non-selective account of the bilingual lexicon
- One core assumption of this model is that bilingual infants activate phonological word-forms in both languages in parallel, even in monolingual situations
- In Study 2, we tested the plausibility of this assumption
:::
# Study 2 {.inverse}
Developmental trajectories of bilingual spoken word recognition
<br>
{width="25%"}
{width="25%"}
::: notes
- In this second study, we examined the language non-selectivity of the initial lexicon, the core assumption of the AMBLA model
- We implemented a spoken word recognition paradigm
:::
## Language non-selectivity in the initial lexicon
<br><br>
Some evidence in infants and children [e.g., @vonholzen2012language; @singh2014one]
. . .
Methodological pitfalls: "Bilingual" task
. . .
One language is task relevant, the other is covertly activated
::: notes
- Although there is some evidence of language non-selectivity in the initial lexicon, most literature presents one methodological caveat
- The experimental paradigms used frequently present participants with words from both languages, even within the same trial
- This may place bilinguals in a bilingual context in which the baseline activation of lexical representations in both languages is increased
- This may lead to cross-language interactions that may not be the result of the experimental manipulation of interest (e.g., priming), but rather from such overall higher activation
:::
## Implicit naming task
::: box
Mani and Plunkett (2010, 2011)
:::
{width="80%"}
## Implicit naming task
::: box
Mani and Plunkett (2010, 2011)
:::
::: columns
:::: {.column}
{width="80%"}
::::
:::: {.column}
::::: incremental
* Chance-level target looking in related trials
* Prime-Target phonological **interference**
* **Implicit naming** of prime pictures
:::::
::::
:::
## Implicit naming task
{width="80%"}
## Study 2: Design

## Study 2: Design

## Study 2: Design

## Study 2: Design
Cross-language priming effects are short-lived
Change in design:
> Auditory label *before* target-distractor images
Increased temporal proximity of prime and target
## Study 2: Design

## Study 2: Design
{ width="90%"}
## Data collection timeline

## Predictions
::: columns
:::: {.column width="50%"}
::::: fragment
:::::: box
**Exp. 1: Monolinguals**
::::::
Replicate **within-language phonological interference** from Mani and Plunkett (*proof of concept*)
:::::
::::
:::: {.column width="50%"}
::::: fragment
:::::: box
**Exp. 2: Monolinguals and bilinguals**
::::::
If **language non-selectivity**, **stronger interference** in cognate vs. non-cognate trials
:::::
::::
:::
## Data collection timeline

## Experiment 1: Results, Bayesian GAMMs
::: columns
:::: {.column width="30%"}
::::: box
English monolinguals
:::::
> 79 participants, 89 sessions
**No evidence of phonological priming**
Related trials $\approx$ Unrelated trials
::::
:::: {.column width="70%"}

::::
:::
## Experiment 2: Results, Bayesian GAMMs
::: columns
:::: {.column width="30%"}
::::: box
Catalan/Spanish monolinguals
:::::
> 77 participants, 107 sessions
**No evidence of phonological priming**
Related trials $\approx$ Unrelated trials
Cognate trials $\approx$ Non-cognate trials
::::
:::: {.column width="70%"}

::::
:::
## Experiment 2: Results, Bayesian GAMMs
::: columns
:::: {.column width="30%"}
::::: box
Catalan/Spanish bilinguals
:::::
> 78 participants, 133 sessions
**No evidence of phonological priming**
Related trials $\approx$ Unrelated trials
Cognate trials $\approx$ Non-cognate trials
::::
:::: {.column width="70%"}

::::
:::
## Discussion
<br>
**Successful spoken word recognition** across ages and language profiles
. . .
**No evidence of priming effects**, within or across languages
> Unsuccessful retrieval of prime phonological forms?
. . .
Inconclusive results, revise design
# General discussion {.inverse}
## Summary
**Cognateness facilitates word acquisition** in the **lower-exposure language**
. . .
Candidate **mechanism** behind bilingual vocabulary growth
> AMBLA: Cross-language accumulation of learning instances
. . .
**Language non-selectivity** in the initial lexicon: Pending testing
---
## Discussion
::: columns
:::: {.column}
::::: box
Discussion
:::::
::::: incremental
- Cognateness facilitation effect [@siow2022effect; @mitchell2023cognates; @tan2024role]
- Candidate explanation for Floccia et al. (2018) and Hoff et al. (2012)
- Standard Model of Language Acquisition [@kachergis2022standard]
:::::
::::
:::: {.column}
::::: fragment
:::::: box
Future steps
::::::
:::::: incremental
- The impact of cognateness in spoken word recognition: *Re-analysing data from Study 2*
- More language pairs (lower overall similarity)
- Train AMBLA model on vocabulary data
::::::
:::::
::::
:::
## What's not in this dissertation
::: columns